FUNCTIONAL DNA CLONE FOR HEPATITIS C VIRUS (HCV)
AND USES THEREOF

GOVERNMENT SUPPORT
The research leading to the present invention was supported, at least in part, by grants from
United States Public Health Service Grant Nos. CA57973 and AI31501. Accordingly, the
Government may have certain rights in the invention.

FIELD OF THE INVENTION
The present invention relates to the determination of functional HCV virus genomic RNA
sequences, to construction of infectious HCV DNA clones, and to use of the clones, or
their derivatives, in therapeutic, vaccine, and diagnostic applications. The invention is also
directed to HCV vectors, e.g., for gene therapy or gene vaccines.

BACKGROUND OF THE INVENTION

Brief general overview of hepatitis C virus
After the development of diagnostic tests for hepatitis A virus and hepatitis B virus, an
additional agent, which could be experimentally transmitted to chimpanzees [Alter et al.,
Lancet 1, 459-463 (1978); Hollinger et al., Intervirology 10, 60-68 (1978); Tabor et al.,
Lancet 1, 463-466 (1978)], became recognized as the major cause of transfusion-acquired
hepatitis. cDNA clones corresponding to the causative non-A non-B (NANB) hepatitis
agent, called hepatitis C virus (HCV), were reported in 1989 [Choo et al., Science 244,
359-362 (1989)]. This breakthrough has led to rapid advances in diagnostics, and in our
understanding of the epidemiology, pathogenesis and molecular virology of HCV (see
Houghton et al., Curr. Stud. Hematol. Blood Transfus. 61, 1-11 (1994) for review).

Evidence of HCV infection is found throughout the world, and the prevalence of HCV-specific
antibodies ranges from 0.4-2% in most countries to more than 14% in Egypt
[Hibbs et al., J. Inf. Dis. 168, 789-790 (1993)]. Besides transmission via blood or blood
products, or less frequently by sexual and congenital routes, sporadic cases, not associated
with known risk factors, occur and account for more than 40% of HCV cases [Alter et al.,
J. Am. Med. Assoc. 264, 2231-2235 (1990); Mast and Alter, Semin. Virol. 4, 273-283
(1993)]. Infections are usually chronic [Alter et al., N. Eng. J. Med. 327, 1899-1905
(1992)], and clinical outcomes range from an inapparent carrier state to acute hepatitis,
chronic active hepatitis, and cirrhosis, which is strongly associated with the development of
hepatocellular carcinoma.

Although interferon (IFN)-α has been shown to be useful for the treatment of a minority of
patients with chronic HCV infections [Davis et al., N. Engl. J. Med. 321, 1501-1506
(1989); DiBisceglie et al., N. Engl. J. Med. 321, 1506-1510 (1989)] and subunit
vaccines show some promise in the chimpanzee model [Choo et al., Proc. Natl. Acad. Sci.
USA 91, 1294-1298 (1994)], future efforts are needed to develop more effective therapies
and vaccines. The considerable diversity observed among different HCV isolates [for
review, see Bukh et al., Sem. Liver Dis. 15, 41-63 (1995)], the emergence of genetic
variants in chronically infected individuals [Enomoto et al., J. Hepatol. 17, 415-416
(1993); Hijikata et al., Biochem. Biophys. Res. Comm. 175, 220-228 (1991); Kato et al.,
Biochem. Biophys. Res. Comm. 189, 119-127 (1992); Kato et al., J. Virol. 67, 3923-3930
(1993); Kurosaki et al., Hepatology 18, 1293-1299 (1993); Lesniewski et al., J. Med.
Virol. 40, 150-156 (1993); Ogata et al., Proc. Natl. Acad. Sci. USA 88, 3392-3396 (1991);
Weiner et al., Virology 180, 842-848 (1991); Weiner et al., Proc. Natl. Acad. Sci. USA 89,
3468-3472 (1992)], and the lack of protective immunity elicited after HCV infection [Farci
et al., Science 258, 135-140 (1992); Prince et al., J. Infect. Dis. 165, 438-443 (1992)]
present major challenges towards these goals.

Molecular Biology of HCV
Classification. Based on its genome structure and virion properties, HCV has been
classified as a separate genus in the flavivirus family, which includes two other genera: the
flaviviruses (e.g., yellow fever (YF) virus) and the animal pestiviruses (e.g., bovine viral
diarrhea virus (BVDV) and classical swine fever virus (CSFV)) [Francki et al., Arch. Virol.
Suppl. 2, 223 (1991)]. All members of this family have enveloped virions that contain a
positive-strand RNA genome encoding all known virus-specific proteins via translation of a
single long open reading frame (ORF).

Structure and physical properties of the virion. Little information is available on the
structure and replication of HCV. Studies have been hampered by the lack of a cell culture
system able to support efficient virus replication and the typically low titers of infectious
virus present in serum. The size of infectious virus, based on filtration experiments, is
between 30-80 nm [Bradley et al., Gastroenterology 88, 173-179 (1985); He et al., J.
Infect. Dis. 156, 636-640 (1987); Yuasa et al., J. Gen. Virol. 72, 2021-2024 (1991)].
Initial measurements of the buoyant density of infectious material in sucrose yielded a range
of values, with the majority present in a low density pool of < 1.1 g/ml [Bradley et al., J.
Med. Virol. 34, 206-208 (1991)]. Subsequent studies have used RT/PCR to detect HCV-specific
RNA as an indirect measure of potentially infectious virus present in sera from
chronically infected humans or experimentally infected chimpanzees. From these studies, it
has become increasingly clear that considerable heterogeneity exists between different
clinical samples, and that many factors can affect the behavior of particles containing HCV
RNA [Hijikata et al., J. Virol. 67, 1953-1958 (1993); Thomssen et al., Med. Microbiol.
Immunol. 181, 293-300 (1992)]. Such factors include association with immunoglobulins
[Hijikata et al., (1993) supra] or low density lipoprotein [Thomssen et al., (1992) supra;
Thomssen et al., Med. Microbiol. Immunol. 182, 329-334 (1993)]. In highly infectious
acute phase chimpanzee serum, HCV-specific RNA is usually detected in fractions of low
buoyant density (1.03-1.1 g/ml) [Carrick et al., J. Virol. Meth. 39, 279-289 (1992);
Hijikata et al., (1993) supra]. In other samples, the presence of HCV antibodies and
formation of immune complexes correlate with particles of higher density and lower
infectivity [Hijikata et al., (1993) supra]. Treatment of particles with chloroform, which
destroys infectivity [Bradley et al., J. Infect. Dis. 148, 254-265 (1983); Feinstone et al.,
Infect. Immun. 41, 816-821 (1983)], or with nonionic detergents, produced RNA-containing
particles of higher density (1.17-1.25 g/ml) believed to represent HCV nucleocapsids
[Hijikata et al., (1993) supra; Kanto et al., Hepatology 19, 296-302 (1994); Miyamoto et
al., J. Gen. Virol. 73, 715-718 (1992)].

There have been reports of negative-sense HCV-specific RNAs in sera and plasma [see
Fong et al., Journal of Clinical Investigation 88: 1058-60 (1991)]. However, it seems
unlikely that such RNAs are essential components of infectious particles, since some sera
with high infectivity can have low or undetectable levels of negative-strand RNA [Shimizu
et al., Proc. Natl. Acad. Sci. USA 90: 6037-6041 (1993)].



The virion protein composition has not been rigorously determined, but putative HCV
structural proteins include a basic C protein and two membrane glycoproteins, E1 and E2.

HCV replication. Early events in HCV replication are poorly understood. Cellular
receptors for the HCV glycoproteins have not been identified. The association of some
HCV particles with beta-lipoprotein and immunoglobulins raises the possibility that these
host molecules may modulate virus uptake and tissue tropism. Studies examining HCV
replication have been largely restricted to human patients or experimentally inoculated
chimpanzees. In the chimpanzee model, HCV RNA is detected in the serum as early as
three days post-inoculation and persists through the peak of serum alanine aminotransferase
(ALT) levels (an indicator of liver damage) [Shimizu et al., Proc. Natl. Acad. Sci. USA 87:
6441-6444 (1990)]. The onset of viremia is followed by the appearance of indirect
hallmarks of HCV infection of the liver. These include the appearance of a cytoplasmic
antigen [Shimizu et al., (1990) supra] and ultrastructural changes in hepatocytes such as the
formation of microtubular aggregates, for which HCV previously was referred to as the
chloroform-sensitive "tubule forming agent" or "TFA" [reviewed by Bradley, Prog. Med.
Virol. 37: 101-135 (1990)]. As shown by the appearance of viral antigens [Blight et al.,
Amer. J. Path. 143: 1568-1573 (1993); Hiramatsu et al., Hepatology 16: 306-311 (1992);
Krawczynski et al., Gastroenterology 103: 622-629 (1992); Yamada et al., Digest. Dis.
Sci. 38: 882-887 (1993)] and the detection of positive and negative sense RNAs [Fong et
al., (1991) supra; Gunji et al., Arch. Virol. 134: 293-302 (1994); Haruna et al., J.
Hepatol. 18: 96-100 (1993); Lamas et al., J. Hepatol. 16: 219-223 (1992); Nouri Aria et
al., J. Clin. Invest. 91: 2226-34 (1993); Sherker et al., J. Med. Virol. 39: 91-96 (1993);
Takehara et al., Hepatology 15: 387-390 (1992); Tanaka et al., Liver 13: 203-208
(1993)], hepatocytes appear to be a major site of HCV replication, particularly during acute
infection [Negro et al., Proc. Natl. Acad. Sci. USA 89: 2247-2251 (1992)]. In later stages
of HCV infection the appearance of HCV-specific antibodies, the persistence or resolution
of viremia, and the severity of liver disease vary greatly both in the chimpanzee model and
in human patients. Although some liver damage may occur as a direct consequence of
HCV infection and cytopathogenicity, the emerging consensus is that host immune
responses, in particular virus-specific cytotoxic T lymphocytes, may play a more dominant
role in mediating cellular damage.

It has been speculated that HCV may also replicate in extra-hepatic reservoir(s). In some
cases, RT/PCR or in situ hybridization has shown an association of HCV RNA with
peripheral blood mononuclear cells including T-cells, B-cells, and monocytes [reviewed in
Blight and Gowans, Viral Hepatitis Rev. 1: 143-155 (1995)]. Such tissue tropism could be
relevant to the establishment of chronic infections and might also play a role in the
association between HCV infection and certain immunological abnormalities such as mixed
cryoglobulinemia [reviewed by Ferri et al., Eur. J. Clin. Invest. 23: 399-405 (1993)],
glomerulonephritis, and rare non-Hodgkin's B-lymphomas [Ferri et al., (1993) supra;
Kagawa et al., Lancet 341: 316-317 (1993)]. However, the detection of circulating
negative strand RNA in serum, the difficulty in obtaining truly strand-specific RT/PCR
[Gunji et al., (1994) supra], and the low numbers of apparently infected cells have made it
difficult to obtain unambiguous evidence for replication in these tissues in vivo.



Genome structure. Full-length or nearly full-length genome sequences of numerous HCV
isolates have been reported [see Lin et al., J. Virol. 68: 5063-5073 (1994a); Okamoto et
al., J. Gen. Virol. 75: 629-635 (1994); Sakamoto et al., J. Gen. Virol. 75: 1761-1768
(1994) and citations therein]. Given the considerable genetic divergence among isolates, it
is clear that several major HCV genotypes are distributed throughout the world. Those of
greatest importance in the U.S. are genotype 1, subtypes 1a and 1b (see below and
Bukh et al., (1995) supra for a discussion of genotype prevalence and distribution). HCV
genome RNAs are ~9.6 kilobases in length (Figure 1). The 5' NTR is 341-344 bases long
and highly conserved. The length of the long ORF varies slightly among isolates, encoding
polyproteins of 3010, 3011 or 3033 amino acids. The reported 3' NTR structures show
considerable diversity both in composition and length (28-42 bases), and appear to
terminate with poly (U) [see Chen et al., Virology 188: 102-113 (1992); Okamoto et al., J.
Gen. Virol. 72: 2697-2704 (1991); Tokita et al., J. Gen. Virol. 66: 1476-83 (1994)] except
in one case (HCV-1, type 1a) which appears to contain a 3' terminal poly (A) tract [Han et
al., Proc. Natl. Acad. Sci. USA 88: 1711-1715 (1991)]. In contrast, our recent analysis
suggests that the genome RNA of the H-strain (also type 1a) contains an internal
polypyrimidine tract followed by a novel RNA element [pending patent application Serial
No. 08/520,678, filed August 29, 1995, and International Patent Application No.
PCT/US96/14033, filed August 28, 1996]. The results presented in pending application
Serial No. 08/520,678 show that the genome RNA of this type 1a isolate does not terminate
with a homopolymer tract as previously thought, but rather with a novel sequence of ~98
bases. Furthermore, this 3' NTR structure and the novel 3' terminal element are features
common to all HCV genotypes which have thus far been examined [Kolykhalov et al., J.
Virol. 70: 3363-3371 (1996); Tanaka et al., Biochem. Biophys. Res. Comm. 215: 744-749
(1996); Tanaka et al., J. Virol. 70: 3307-12 (1996); Yamada et al., Virology 223: 255-261
(1996)].
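
As a rough arithmetic check on the polyprotein sizes quoted above, the coding length implied by each size can be computed. The sketch below assumes three nucleotides per codon and a single stop codon terminating the ORF; the helper name and the stop-codon handling are illustrative assumptions, not taken from the application.

```python
# Illustrative sketch only: helper name and stop-codon assumption are hypothetical.
NT_PER_CODON = 3
STOP_CODON_NT = 3  # assumes one stop codon closes the long ORF

# Polyprotein lengths (amino acids) reported in the text above.
REPORTED_POLYPROTEIN_AA = (3010, 3011, 3033)


def orf_length_nt(polyprotein_aa: int) -> int:
    """Return the ORF length in nucleotides for a polyprotein of the given size."""
    if polyprotein_aa <= 0:
        raise ValueError("polyprotein length must be positive")
    return polyprotein_aa * NT_PER_CODON + STOP_CODON_NT


# Map each reported polyprotein size to its implied ORF length.
orf_lengths = {aa: orf_length_nt(aa) for aa in REPORTED_POLYPROTEIN_AA}
```

Under these assumptions the ORF alone accounts for a little over 9 kilobases of the roughly 9.6 kilobase genome, with the remainder divided between the 5' and 3' NTRs.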
Translation and proteolytic processing. Several studies have used cell-free translation and
transient expression in cell culture to examine the role of the 5' NTR in translation initiation
[Fukushi et al., Biochem. Biophys. Res. Comm. 199: 425-432 (1994); Tsukiyama-Kohara
et al., J. Virol. 66: 1476-1483 (1992); Wang et al., J. Virol. 67: 3338-3344 (1993); Yoo et
al., Virology 191: 889-899 (1992)]. This highly conserved sequence contains multiple
short AUG-initiated ORFs and shows significant homology with the 5' NTR region of
pestiviruses [Bukh et al., Proc. Natl. Acad. Sci. USA 89: 4942-4946 (1992); Han et al.,
(1991) supra]. A series of stem-loop structures have been proposed on the basis of
computer modeling and sensitivity to digestion by different ribonucleases [Brown et al.,
Nucl. Acids Res. 20: 5041-5045 (1992); Tsukiyama-Kohara et al., (1992) supra]. The
results from several groups indicate that this element functions as an internal ribosome entry
site (IRES) allowing efficient translation initiation at the first AUG of the long ORF
[Fukushi et al., (1994) supra; Tsukiyama-Kohara et al., (1992) supra; Wang et al., (1993)
supra; Yoo et al., (1992) supra]. Some of the predicted features of the HCV and pestivirus
IRES elements are similar to one another [Brown et al., (1992) supra]. The ability of this
element to function as an IRES suggests that HCV genome RNAs may lack a 5' cap
structure.

The organization and processing of the HCV polyprotein (Figure 1) appear to be most
similar to those of the pestiviruses. At least 10 polypeptides have been identified and the
order of these cleavage products in the polyprotein is NH2-C-E1-E2-p7-NS2-NS3-NS4A-
NS4B-NS5A-NS5B-COOH. As shown in Figure 1, proteolytic processing is mediated by
host signal peptidase and two HCV-encoded proteinases, the NS2-3 autoproteinase and the
NS3-4A serine proteinase [see Rice, In "Fields Virology" (B. N. Fields, D. M. Knipe and
P. M. Howley, Eds.), Vol. pp. 931-960. Raven Press, New York (1996); Shimotohno et al.,
J. Hepatol. 22: 87-92 (1995) for reviews]. C is a basic protein believed to be the viral
core or capsid protein; E1 and E2 are putative virion envelope glycoproteins; p7 is a
hydrophobic protein of unknown function that is inefficiently cleaved from the E2
glycoprotein [Lin et al., (1994a) supra; Mizushima et al., J. Virol. 68: 6215-6222 (1994);
Selby et al., Virology 204: 114-122 (1994)], and NS2-NS5B are likely nonstructural (NS)
proteins which function in viral RNA replication complexes. In particular, besides its N-terminal
serine proteinase domain, NS3 contains motifs characteristic of RNA helicases and
has been shown to possess an RNA-stimulated NTPase activity [Suzich et al., J. Virol. 67,
6152-6158 (1993)]; NS5B contains the GDD motif characteristic of the RNA-dependent
RNA polymerases of positive-strand RNA viruses.
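
The polyprotein layout described above can be written down as a small data structure. The sketch below is only an illustration of the stated order and roles; the function names and the role labels are assumptions introduced here, and the "putative" and "likely" qualifiers mirror the hedging in the text.

```python
# Illustrative sketch: names and labels are hypothetical, order is from the text.
POLYPROTEIN_ORDER = [
    "C", "E1", "E2", "p7", "NS2", "NS3", "NS4A", "NS4B", "NS5A", "NS5B",
]

# C, E1 and E2 are described above as putative virion components.
PUTATIVE_STRUCTURAL = {"C", "E1", "E2"}


def cleavage_junctions(order):
    """List adjacent product pairs, i.e. the junctions that must be cleaved."""
    return [f"{left}/{right}" for left, right in zip(order, order[1:])]


def classify(product):
    """Return the role the text assigns to a cleavage product."""
    if product in PUTATIVE_STRUCTURAL:
        return "structural (putative)"
    if product.startswith("NS"):
        return "nonstructural (likely)"
    return "unknown function"  # p7, per the text


junctions = cleavage_junctions(POLYPROTEIN_ORDER)
roles = {product: classify(product) for product in POLYPROTEIN_ORDER}
```

Ten products imply nine junctions; which proteinase acts at each junction is shown in Figure 1 and is deliberately not encoded here.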

HCV RNA replication. By analogy with flaviviruses, replication of the positive-sense HCV
virion RNA is thought to occur via a minus-strand intermediate. This strategy can be
described briefly as follows: (i) uncoating of the incoming virus particle releases the
genomic plus-strand, which is translated to produce a single long polyprotein that is
probably processed co- and post-translationally to produce individual structural and
nonstructural proteins; (ii) the nonstructural proteins presumably form a replication
complex that utilizes the virion RNA as template for the synthesis of minus strands; (iii)
these minus strands in turn serve as templates for synthesis of plus strands, which can be
used for additional translation of viral protein, minus strand synthesis, or packaging into
progeny virions. Very few details about the HCV replication process are available, due to the
lack of a good experimental system for virus propagation. Detailed analyses of authentic
HCV replication and other steps in the viral life cycle would be greatly facilitated by the
development of an efficient system for HCV replication in cell culture.

Many attempts have been made to infect cultured cells with serum collected from HCV-infected
individuals, and low levels of replication have been reported in a number of cell
types infected by this method, including B-cell [Bertolini et al., Res. Virol. 144: 281-285
(1993); Nakajima et al., J. Virol. 70: 9925-9 (1996); Valli et al., Res. Virol. 146: 285-288
(1995)], T-cell [Kato et al., Biochem. Biophys. Res. Comm. 206: 863-9 (1996); Mizutani
et al., Biochem. Biophys. Res. Comm. 227: 822-826 (1996a); Mizutani et al., J. Virol. 70: 7219-7223
(1996); Nakajima et al., (1996) supra; Shimizu and Yoshikura, J. Virol. 68: 8406-8408
(1994); Shimizu et al., Proc. Natl. Acad. Sci. USA 89: 5477-5481 (1992); Shimizu et
al., Proc. Natl. Acad. Sci. USA 90: 6037-6041 (1993)], and hepatocyte [Kato et al., Jpn.
J. Cancer Res. 87: 787-92 (1996); Tagawa, J. Gastroenterol. Hepatol. 10: 523-527
(1995)] cell lines, as well as peripheral blood mononuclear cells (PBMCs) [Cribier et al., J.
Gen. Virol. 76: 2485-2491 (1995)], and primary cultures of human fetal hepatocytes
[Carloni et al., Arch. Virol. Suppl. 8: 31-39 (1993); Cribier et al., (1995) supra; Iacovacci
et al., Res. Virol. 144: 275-279 (1993)] or hepatocytes from adult chimpanzees [Lanford et
al., Virology 202: 606-14 (1994)]. HCV replication has also been detected in primary
hepatocytes derived from a human HCV patient that were infected with the virus in vivo
prior to cultivation [Ito et al., J. Gen. Virol. 77: 1043-1054 (1996)] and in the human
hepatoma cell line Huh7 following transfection with RNA transcribed in vitro from an
HCV-1 cDNA clone [Yoo et al., J. Virol. 69: 32-38 (1995)]. The reported observation of
replication in cells transfected with RNA derived from the HCV-1 clone was puzzling,
since this clone lacks the 3' NTR sequence downstream of the homopolymer tract (see
below). The most well-characterized cell-culture systems for HCV replication utilize a B-cell
line (Daudi) or T-cell lines persistently infected with retroviruses (HPB-Ma or MT-2)
[Kato et al., (1995) supra; Mizutani et al., Biochem. Biophys. Res. Comm. 227: 822-826
(1996a); Mizutani et al., (1996) supra; Nakajima et al., (1996) supra; Shimizu and
Yoshikura, (1994) supra; Shimizu et al., Proc. Natl. Acad. Sci. USA 90: 6037-6041 (1993)].
HPBMa is infected with an amphotropic murine leukemia virus pseudotype of murine
sarcoma virus, while MT-2 is infected with human T-cell lymphotropic virus type I (HTLV-I).
Clones (HPBMa10-2 and MT-2C) that support HCV replication more efficiently than
the uncloned population have been isolated for the two T-cell lines HPBMa and MT-2
[Mizutani et al., J. Virol. (1996) supra; Shimizu et al., (1993) supra]. However, the
maximum levels of RNA replication obtained in these lines or in the Daudi lines after
degradation of the input RNA are still only about 5 x 10^ RNA molecules per 10^ cells
[Mizutani et al., (1996a) supra; Mizutani et al., (1996) supra] or 10* RNA molecules per ml
of culture medium [Nakajima et al., (1996) supra]. Although the level of replication is
low, long-term infections of up to 198 days in one system [Mizutani et al., Biochem.
Biophys. Res. Comm. 227: 822-826 (1996a)] and more than a year in another system
[Nakajima et al., (1996) supra] have been documented, and infectious virus production has
been demonstrated by serial cell-free or cell-mediated passage of the virus to naive cells.

However, efficient HCV replication has not been observed in any of the cell-culture
systems described to date, and all of the groups that have attempted to establish such
systems have encountered a number of problems, including the difficulty in distinguishing
input RNA from plus strands produced by replication, the false detection of minus strands,
and generally low titers of replicated RNA. Thus, despite these advances, more efficient
cell-culture systems for HCV propagation are needed for the production of concentrated
virus stocks, structural analysis of virion components, and improved analyses of
intracellular viral processes, including RNA replication.

Virion assembly and release. This process has not been examined directly, but the lack of
complex glycans, the ER localization of expressed HCV glycoproteins [Dubuisson et al., J.
Virol. 68: 6147-6160 (1994); Ralston et al., J. Virol. 67: 6753-6761 (1993)] and the
absence of these proteins on the cell surface [Dubuisson et al., (1994) supra; Spaete et al.,
Virology 188: 819-830 (1992)] suggest that initial virion morphogenesis may occur by
budding into intracellular vesicles. Thus far, efficient particle formation and release have not
been observed in transient expression assays, suggesting that essential viral or host factors
are absent or blocked. HCV virion formation and release may be inefficient, since a
substantial fraction of the virus remains cell-associated, as found for the pestiviruses. A
recent study indicates that extracellular HCV particles partially purified from human plasma
contain complex N-linked glycans, although these carbohydrate moieties were not shown to
be specifically associated with E1 or E2 [Sato et al., Virology 196: 354-357 (1993)].
Complex glycans associated with glycoproteins on released virions would suggest transit
through the trans-Golgi and movement of virions through the host secretory pathway. If
this is correct, intracellular sequestration of HCV glycoproteins and virion formation might
then play a role in the establishment of chronic infections by minimizing immune
surveillance and preventing lysis of virus-infected cells via antibody and complement.

Genetic variability. As for all positive-strand RNA viruses, the RNA-dependent RNA
polymerase (RDRP) of HCV (NS5B) is believed to lack a 3'-5' exonuclease proofreading
activity for removal of misincorporated bases. Replication is therefore error-prone, leading
to a "quasi-species" virus population consisting of a large number of variants [Martell et
al., J. Virol. 66: 3225-3229 (1992); Martell et al., J. Virol. 68: 3425-3436 (1994)]. This
variability is apparent at multiple levels. First, in a chronically infected individual, changes
in the virus population occur over time [Ogata et al., (1991) supra; Okamoto et al.,
Virology 190: 894-899 (1992)], and these changes may have important consequences for
disease. A particularly interesting example is the N-terminal 30 residue segment of the E2
glycoprotein, which exhibits a much higher degree of variability than the rest of the
polyprotein [for examples, see Higashi et al., Virology 197, 659-668 (1993); Hijikata et al.,
(1991) supra; Weiner et al., (1991) supra]. There is accumulating evidence that this
hypervariable region, perhaps analogous to the V3 domain of HIV-1 gp120, may be under
immune selection by circulating HCV-specific antibodies [Kato et al., (1993) supra;
Taniguchi et al., Virology 195: 297-301 (1993); Weiner et al., (1992) supra]. In this
model, antibodies directed against this portion of E2 may contribute to virus neutralization
and thus drive the selection of variants with substitutions that permit escape from
neutralization. This plasticity suggests that a specific amino acid sequence in the E2
hypervariable region is not essential for other functions of the protein such as virion
attachment, penetration, or assembly.

Genetic variability may also contribute to the spectrum of different responses observed after
IFN-α treatment of chronically infected patients. Diminished serum ALT levels and
improved liver histology, which usually correlate with a decrease in the level of circulating
HCV RNA, are seen in ~40% of those treated [Greiser-Wilke et al., J. Gen. Virol. 72:
2015-2019 (1991)]. After treatment, approximately 70% of the responders relapse. In
some cases, after a transient loss of circulating viral RNA, renewed viremia is observed
during or after the course of treatment. While this might suggest the existence or
generation of IFN-resistant HCV genotypes or variants, further work is needed to
determine the relative contributions of virus genotype and host-specific differences in
immune response.

Finally, sequence comparisons of different HCV isolates around the world have revealed
enormous genetic diversity [reviewed in Bukh et al., (1995) supra]. Because of the
lack of biologically relevant serological assays such as cross-neutralization tests, HCV types
(designated by numbers), subtypes (designated by letters), and isolates are currently
grouped on the basis of nucleotide or amino acid sequence similarity. Amino acid sequence
similarity between the most divergent genotypes can be as little as ~50%, depending upon
the protein being compared. This diversity has important biological implications,
particularly for diagnosis, vaccine design, and therapy.

Attempts by others to generate infectious HCV transcripts from cDNA
A recent paper [Yoo et al., J. Virol. 69: 32-38 (1995)] reports replication of transcribed
HCV-1 RNA after transfection of Huh7 cells. In this paper, T7 transcripts from various
derivatives of an HCV-1 cDNA clone were tested for their ability to replicate following
transfection of the human hepatoma cell line, Huh7. Possible HCV replication was
assessed by strand-specific RT/PCR (using 5' NTR primers) and metabolic labeling of
HCV-specific RNAs with 3H-uridine. Apparently full-length transcripts, terminating with
either poly (A) or poly (U), were positive by these assays, but those with a deletion of the
5' terminal 144 bases were not. In some cultures, HCV-specific RNA was detected in the
culture media and this putative virus was used to reinfect fresh Huh7 cells.



The present inventors have been unable to reproduce these results. It appears that this
report describes transient replication, rather than authentic HCV infection, with replication
and virus production. Some of the data appear self-contradictory. For instance, the
positive control reported in this paper was productive transfection of Huh7 cells with RNA
extracted from 1 ml of high HCV titer chimpanzee plasma. This extracted sample would
contain a maximum of 10' potentially infectious full-length HCV RNA molecules. Under
optimum transfection conditions (other than microinjection), greater than 10^ RNA
molecules of virion RNA (at least for poliovirus, Sindbis virus, or YF) are typically
required to initiate a single infectious event. This suggests that in the reported HCV-1
experiment fewer than 100 cells would be productively transfected. Furthermore, at 16
days post-transfection, both positive- and negative-strand RNAs were reportedly detected
after eight hours of metabolic labeling. The detection of negative-strand RNA by this
method (both for transfected virion RNA and transcript RNA) suggests that HCV is capable
of both efficient replication and spread, and that the level of HCV RNA synthesis is similar
to that which would be expected for a more robust flavivirus, such as YF (at the peak of a
high multiplicity infection). Yet Yoo et al. did not report detection of HCV antigens in
these cells using a variety of antisera, nor were they able to report detection of full-length
positive- or negative-strands by Northern analysis (which is much more sensitive than
metabolic labeling with 3H-uridine). Finally, the critical experiment, demonstrating that
RNA or virus derived from the HCV-1 clone is infectious in the chimpanzee model, has not
been reported.
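
The transfection argument above is a simple ratio: the number of RNA molecules available divided by the number typically needed per infectious event bounds the number of productively transfected cells. The sketch below illustrates that bound. The function name is hypothetical, and the numeric inputs are placeholder assumptions chosen only to show the calculation; they are not figures taken from this text.

```python
# Illustrative sketch: function name and example inputs are hypothetical.
def max_productive_events(rna_molecules: int, molecules_per_event: int) -> int:
    """Upper bound on infectious events given the molecules needed per event."""
    if rna_molecules < 0 or molecules_per_event <= 0:
        raise ValueError("inputs must be non-negative and positive, respectively")
    return rna_molecules // molecules_per_event


# Placeholder values (assumptions, not from the source): any pair of inputs
# whose ratio is 100 gives the same upper bound of 100 events.
example_bound = max_productive_events(1_000_000, 10_000)
```

Because the requirement is stated as "greater than" a threshold per event and the supply as a "maximum", the true number of productive events would fall below this bound.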

Importance of Infectious Clone Technology for HCV Research
Despite the great deal of progress made in the last several years, a vast number of questions
concerning HCV replication, pathogenesis, and immunity remain unanswered. The field is
rapidly reaching a bottleneck where we understand some aspects of the functions of the
HCV RNA genome and its encoded proteins, but have no way of experimentally testing
structure/function questions in the context of authentic virus replication. Such analyses are
critical for understanding each step in the virus life cycle to enable the design of protective
vaccines, effective therapy, and HCV diagnostics.



WO 98/39031    PCT/US98/04428

Thus, there is a need in the art for authentic HCV genetic material for expression of
infectious HCV RNA.

There is a further need in the art for authentic genetic material for expression of native
HCV virions and viral particle proteins, which can, in turn, permit characterization of HCV
virion structure.

The art also requires an in vitro culture method for infectious HCV, which would permit
analysis of HCV receptor binding, cellular infection, replication, virion assembly, and
release.

These and other needs in the art are addressed by the present invention.

The citation of any reference herein should not be construed as an admission that such
reference is available as "Prior Art" to the instant application.

SUMMARY OF THE INVENTION
The present invention advantageously provides an authentic hepatitis C virus (HCV) DNA
clone capable of replication, expression of functional HCV proteins, and infection in vivo
and in vitro for development of antiviral therapeutics and diagnostics.

In a broad aspect, the present invention is directed to a genetically engineered hepatitis C
virus (HCV) nucleic acid clone which comprises from 5' to 3' on the positive-sense nucleic
acid a functional 5' non-translated region (NTR) comprising an extreme 5'-terminal
conserved sequence, an open reading frame (ORF) encoding at least a portion of an HCV
polyprotein whose cleavage products form functional components of HCV virus particles
and RNA replication machinery, and a 3' non-translated region (NTR) comprising an
extreme 3'-terminal conserved sequence, or a derivative thereof selected from the group
consisting of adapted virus, live-attenuated virus, replication-competent non-infectious
virus, and defective virus. It has been found by the present inventors that various
manipulations, effected using genetic engineering techniques, are required to produce an
authentic HCV nucleic acid, e.g., a cDNA that can be transcribed to produce infectious
HCV RNA, or an infectious HCV RNA. By providing engineered authentic HCV nucleic
acids, the present inventors have for the first time enabled dissection of HCV replication
machinery and protein activity, and preparation of various HCV derivatives. Previously,
since there was uncertainty about whether any given HCV clone contained an error or
mutation that led to its inability to function, one could not be certain whether starting material
for further analysis of HCV was useful or simply an artifact. Thus, a major
advantage of the present invention is that it provides authentic HCV, thus assuring that any
modifications result in real changes rather than artifacts due to errors in the clones provided
in the prior art.

A further advantage of the present invention is recognition of the characteristics of an
infectious HCV genome, particularly in the polyprotein coding region. In a specific
embodiment, the HCV nucleic acid has a consensus nucleic acid sequence determined from
the sequence of a majority of at least three clones of an HCV isolate or genotype.
Preferably, the HCV nucleic acid has at least a functional portion of a sequence as shown in
SEQ ID NO:1, which represents a specific embodiment of the present invention exemplified
herein. It should be noted that while SEQ ID NO:1 is a DNA sequence, the present
invention contemplates the corresponding RNA sequence, and DNA and RNA
complementary sequences as well. In a further embodiment, a region from an HCV isolate
is substituted for a homologous region, e.g., of an HCV nucleic acid having a sequence of
SEQ ID NO:1. In a further preferred embodiment, exemplified herein, the HCV nucleic
acid is a DNA that codes on expression for a replication-competent HCV RNA replicon, or
is itself a replication-competent HCV RNA replicon. In a specific example, infra, an HCV
nucleic acid of the invention has a full length sequence as depicted in or corresponding to
SEQ ID NO:1. Various modifications of the 5' and 3' termini are also contemplated by the
invention. For example, the 5'-terminal sequence can be homologous or complementary to
an RNA sequence selected from the group consisting of GCCAGCC; GGCCAGCC;
UGCCAGCC; AGCCAGCC; AAGCCAGCC; GAGCCAGCC; GUGCCAGCC; and
GCGCCAGCC, wherein the sequence GCCAGCC is the 5'-terminus of SEQ ID NO:3.
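
The majority rule for a consensus sequence described above can be illustrated with a minimal sketch. It assumes the clone sequences have already been aligned to equal length; the function name, the toy sequences, and the use of "N" for positions without a strict majority are illustrative assumptions and are not taken from the specification.

```python
from collections import Counter


def majority_consensus(aligned_clones):
    """Return the per-position majority base of aligned, equal-length clones."""
    if len(aligned_clones) < 3:
        raise ValueError("at least three clones are needed for a majority call")
    length = len(aligned_clones[0])
    if any(len(seq) != length for seq in aligned_clones):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_clones):
        base, count = Counter(column).most_common(1)[0]
        # Keep the base only if it is found in a strict majority of clones;
        # otherwise flag the position as unresolved (an assumption of this sketch).
        consensus.append(base if count * 2 > len(column) else "N")
    return "".join(consensus)


# Hypothetical toy clones, not real HCV sequence data.
clones = ["GCCAGCCCCC", "GCCAGCCCCT", "GCTAGCCCCC"]
print(majority_consensus(clones))
```

In this toy case each of the two variant positions is outvoted two to one, so the isolated differences are treated as clone-specific changes rather than as part of the consensus.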

Still another advantage of the present invention is the demonstration of the importance of
the complete 3'-NTR for an infectious HCV clone. The 3'-NTR, particularly the
approximately 98 base extreme terminal sequence, which is highly conserved among HCV
genotypes, is the subject of U.S. Patent Application Serial No. 08/520,678, filed August
29, 1995, which is incorporated herein by reference in its entirety; and PCT International
Application No. PCT/US96/14033, filed August 28, 1996, which is also incorporated
herein by reference in its entirety. Thus, in a preferred aspect, the functional 3'-NTR
comprises a 3'-terminal sequence of approximately 98 bases that is highly conserved among
HCV genotypes. In a specific embodiment, the 3'-NTR extreme terminus is homologous or
complementary to a DNA having the sequence
5'-GGTGGCTCCATCTTAGCCCTAGTCACGGCTAGCTGTGAAAGGTCCGTGAGCCG
CATGACTGCAGAGAGTGCTGATACTGGCCTCTCTGCTGATCATGT-3' (SEQ ID
NO:4). In a specific embodiment, exemplified in SEQ ID NO:1, the 3'-NTR comprises a
long poly-pyrimidine region (e.g., about 133 bases); however, alternative length
poly-pyrimidine regions are also encompassed, including short regions (about 75 bases), or
regions that are shorter or longer. Naturally, in a positive strand HCV DNA nucleic acid,
the poly-pyrimidine region is a poly(T/TC) region, and in a positive strand HCV RNA
nucleic acid, the poly-pyrimidine region is a poly(U/UC) region.
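
The relationship between the DNA and RNA forms of these sequences, and between positive and negative strands, can be sketched as follows. The SEQ ID NO:4 string is copied from the text above; the helper names and the short poly(T/TC) fragment are hypothetical and serve only as illustration.

```python
# 3'-terminal conserved sequence (SEQ ID NO:4), positive-strand DNA, 5' to 3'.
SEQ_ID_NO_4 = (
    "GGTGGCTCCATCTTAGCCCTAGTCACGGCTAGCTGTGAAAGGTCCGTGAGCCG"
    "CATGACTGCAGAGAGTGCTGATACTGGCCTCTCTGCTGATCATGT"
)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def dna_to_rna(dna):
    """Write a DNA sequence as the corresponding RNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def reverse_complement(dna):
    """Return the complementary (negative-strand) DNA, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def ends_with_conserved_terminus(positive_strand_dna):
    """Check whether a positive-strand DNA ends in the SEQ ID NO:4 sequence."""
    return positive_strand_dna.upper().endswith(SEQ_ID_NO_4)


# Hypothetical short poly(T/TC) stretch followed by the conserved terminus.
toy_3_ntr = "TTTCCTTTTCTTTCC" + SEQ_ID_NO_4
print(len(SEQ_ID_NO_4))                         # length of the conserved element
print(ends_with_conserved_terminus(toy_3_ntr))  # True for this toy construct
print(dna_to_rna(toy_3_ntr)[:15])               # poly(U/UC) form of the fragment
```

The same helpers apply to any sequence in the specification that is given as DNA but is also intended in its RNA or complementary form.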

According to various aspects of the invention, an HCV nucleic acid, including the
polyprotein coding region, can be mutated or engineered to produce variants or derivatives
with, e.g., silent mutations, conservative mutations, etc. Such clones may also be adapted,
e.g., by selection for propagation in animals or in vitro. The present invention further
permits creation of HCV chimeras, in which portions of the genome from other genotypes or
isolates are substituted for the homologous region of an HCV clone, such as SEQ ID NO:1
or the deposited embodiment, infra. In still other embodiments, the invention provides
methods for preparing, and clones comprising, polyprotein coding sequence from an HCV
genotype selected from the group consisting of the HCV-1, HCV-1a, HCV-1b, HCV-1c,
HCV-2a, HCV-2b, HCV-2c, HCV-3a, and any "quasi-species" variant thereof. In a
further preferred aspect, silent nucleotide changes in the polyprotein coding regions (i.e.,
variations of the third base of a codon that encodes the same amino acid) are incorporated
as markers of specific HCV clones.
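
Whether a given nucleotide change is silent in the sense used above can be checked with a short sketch based on the standard genetic code. The function names and the nine-base reading frame are hypothetical; positions are zero-based and the sequence is assumed to start in frame.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(orf):
    """Translate an in-frame DNA coding sequence, ignoring a trailing partial codon."""
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


def is_silent(orf, position, new_base):
    """True if changing the base at `position` leaves the encoded protein unchanged."""
    mutated = orf[:position] + new_base + orf[position + 1:]
    return mutated != orf and translate(mutated) == translate(orf)


# Hypothetical in-frame fragment encoding Met-Gly-Thr.
orf = "ATGGGCACC"
print(translate(orf))
print(is_silent(orf, 5, "T"))  # third base of GGC changed to give GGT (still Gly)
print(is_silent(orf, 3, "A"))  # first base of GGC changed to give AGC (Ser)
```

A third-base change that passes this check alters the nucleotide sequence without altering the polyprotein, which is what allows it to serve as a marker of a particular clone.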

In a further aspect of the invention, an HCV nucleic acid, including attenuated and
defective variants thereof, can comprise a heterologous gene operatively associated with an
expression control sequence, wherein the heterologous gene and expression control
sequence are oriented on the positive-strand nucleic acid molecule. In a specific
embodiment, the heterologous gene is inserted by a strategy selected from the group
consisting of in-frame fusion with the HCV polyprotein coding sequence; and creation of an
additional cistron. The heterologous gene can be an antibiotic resistance gene or a reporter
gene. Alternatively, the heterologous gene can be a therapeutic gene, or a gene encoding a
vaccine antigen, i.e., for gene therapy or gene vaccine applications, respectively. In a
specific embodiment, where the heterologous gene is an antibiotic resistance gene, the
antibiotic resistance gene is a neomycin resistance gene operatively associated with an
internal ribosome entry site (IRES) inserted in an SfiI site in the 3'-NTR.

Naturally, as noted above, the HCV nucleic acid of the invention is selected from the group
consisting of double stranded DNA, positive-sense cDNA, or negative-sense cDNA, or
positive-sense RNA or negative-sense RNA. Thus, where particular sequences of nucleic
acids of the invention are set forth, both DNA and corresponding RNA are intended,
including positive and negative strands thereof.

An HCV DNA may be inserted in a plasmid vector for transcription of the corresponding
HCV RNA. Thus, the HCV DNA may comprise a promoter 5' of the 5'-NTR on positive-sense
DNA, whereby transcription of template DNA from the promoter produces
replication-competent RNA. The promoter can be selected from the group consisting of a
eukaryotic promoter, yeast promoter, plant promoter, bacterial promoter, or viral
promoter. In specific examples, infra, phage T7 and SP6 promoters are employed. In a
specific embodiment, the present invention is directed to a plasmid clone, p90/HCVFL
[long poly(U)], harboring a full-length HCV cDNA which can be transcribed to produce
infectious HCV RNA transcripts as deposited with the American Type Culture Collection
(ATCC), 12301 Parklawn Drive, Rockville, Maryland 20852, USA on February 13, 1997,
and assigned accession no. 97879, having a sequence as depicted in SEQ ID NO:5.

Naturally, the invention also includes a derivative of this plasmid, selected from the group
consisting of a derivative wherein a 5'-terminal sequence is homologous or complementary
to an RNA sequence selected from the group consisting of GCCAGCC, GGCCAGCC,
UGCCAGCC, AGCCAGCC, AAGCCAGCC, GAGCCAGCC, GUGCCAGCC, and
GCGCCAGCC, wherein the sequence GCCAGCC is the 5'-terminus of SEQ ID NO:3; and
a derivative wherein a 3'-NTR comprises a short poly-pyrimidine region (since the
deposited embodiment has a long poly-pyrimidine region, which may be preferred). In a
further embodiment, a derivative of the deposited embodiment may be selected from the
group consisting of a derivative produced by substitution of homologous regions from other
HCV isolates or genotypes; a derivative produced by mutagenesis; a derivative selected
from the group consisting of adapted, live-attenuated, replication competent non-infectious,
and defective variants; a derivative comprising a heterologous gene operatively associated
with an expression control sequence; and a derivative consisting of a functional fragment of
any of the above-mentioned derivatives. Alternatively, portions of the deposited DNA
clone, such as the 5' NTR, the polyprotein coding regions, the 3'-NTR or more generally
any coding or non-translated region of the HCV genome, can be substituted with a
corresponding region from a different HCV genotype to generate a new chimeric infectious
clone, or by extension, infectious clones of other isolates and genotypes. For example, an
HCV-1b or -2a polyprotein coding region (or consensus polyprotein coding regions) can be
substituted for the HCV-H (1a strain) polyprotein coding region of the deposited clone.

Naturally, the present invention further provides an HCV DNA or RNA transcribed from 
the full length HCV cDNA harbored in the plasmid clones set forth above. 

Thus, the specific HCV genome itself provides an excellent starting material for deriving
modified variants of HCV, since any modifications will result from changes to authentic 
virus, rather than artifacts resulting from an accumulation of changes and errors. The HCV 
DNA clones or RNAs of the invention can be used in numerous methods, or to derive 
authentic HCV components, as set forth below. 


For example, the invention provides a method for identifying a cell line that is permissive
for infection with HCV, comprising contacting a cell line in tissue culture with an infectious
amount of HCV RNA, e.g., as produced from the plasmid clones recited above, and
detecting replication of HCV in cells of the cell line. Naturally, the invention extends as
well to a method for identifying an animal that is permissive for infection with HCV,
comprising introducing an infectious amount of the HCV RNA, e.g., as produced by the
plasmids above, to the animal, and detecting replication of HCV in the animal. By
providing authentic infectious HCV, preferably comprising a dominant selectable marker,
the invention further provides a method for selecting for HCV with adaptive mutations that
permit higher levels of HCV replication in a permissive cell line or animal comprising
contacting a cell line in culture, or introducing into an animal, an infectious amount of the
HCV RNA, and detecting progressively increasing levels of HCV RNA in the cell line or
the animal. In a specific embodiment, the adaptive mutation permits modification of HCV
tropism. An immediate implication of this aspect of the invention is creation of new valid
animal models for HCV infection.



The permissive cell lines or animals that are identified using the nucleic acids of the 
invention are very useful, inter alia, for studying the natural history of HCV infection, 
isolating functional components of HCV, and for sensitive, fast diagnostic applications, in
addition to producing authentic HCV virus or components thereof. As noted above, a
particular advantage of the invention is that it represents the first successful preparation of
an HCV DNA clone capable of initiating a productive infection in animals or cell lines. 

Because the HCV DNA, e.g., plasmid vectors, of the invention encode authentic HCV
components, expression of such vectors in a host cell line transfected, transformed, or
transduced with the HCV DNA can be effected. For example, a baculovirus or plant
expression system can be harnessed to express HCV virus particles or components thereof.
Thus, a host cell line may be selected from the group consisting of a bacterial cell, a yeast
cell, a plant cell, an insect cell, and a mammalian cell.

Because the invention provides, inter alia, infectious HCV RNA, the invention provides a 
method for infecting an animal with HCV which comprises administering an infectious dose 
of HCV RNA, such as the HCV RNA transcribed from the plasmids described above, to 
the animal. Naturally, the invention provides a non-human animal infected with HCV of
the invention, which non-human animal can be prepared by the foregoing methods. 

A further advantage of the present invention is that, by providing a complete functional
HCV genome, authentic HCV viral particles or components thereof, which may be
produced with native HCV proteins or RNA in a way that is not possible in subunit
expression systems, can be prepared. In addition, since each component of HCV of the
invention is functional (thus yielding the authentic HCV), any specific HCV component is
an authentic component, i.e., lacking any errors that may, at least in part, affect the clones
of the prior art. Indeed, a further advantage of the invention is the ability to generate HCV
virus particles or virus particle proteins that are structurally identical to or closely related to
natural HCV virions or proteins. Thus, in a further embodiment, the invention provides a
method for propagating HCV in vitro comprising culturing a cell line contacted with an
infectious amount of HCV RNA of the invention, e.g., HCV RNA transcribed from the
plasmids described above, under conditions that permit replication of the HCV RNA.
Naturally, the invention extends to an in vitro cell line infected with HCV, wherein the
HCV has a genomic RNA sequence as described above. In a specific embodiment, the cell
line is a hepatocyte cell line. The invention further provides various methods for producing
HCV virus particles, including by isolating HCV virus particles from the HCV-infected
non-human animal of the invention; culturing a cell line of the invention under conditions that
permit HCV replication and virus particle formation; or culturing a host expression cell line
transfected with HCV DNA under conditions that permit expression of HCV particle
proteins; and isolating HCV particles or particle proteins from the cell culture. The present
invention extends to an HCV virus particle comprising a replication-competent HCV
genome RNA, or a replication-defective HCV genome RNA, corresponding to an HCV
nucleic acid of the invention as well.



By providing for insertion of heterologous genes in the HCV nucleic acids, e.g., DNA or
RNA vectors, the present invention provides a method for transducing an animal susceptible
to HCV infection with a heterologous gene, e.g., for gene therapy or gene vaccination, by
administering an amount of the HCV RNA to the animal effective to infect the animal with
the HCV RNA. In a specific embodiment, such an HCV vector is generated in HCV
harbored in the plasmids described above.



Also provided is an in vitro cell-free assay system for HCV comprising HCV genomic
template RNA of the invention, e.g., as transcribed from a plasmid of the invention as set
forth above, functional HCV replicase components, and an isotonic buffered medium
comprising ribonucleotide triphosphate bases. These elements provide the replication
machinery and raw materials (NTPs).


The authentic HCV viral particles and viral particle proteins are a preferred starting
material as HCV antigens. Thus, in a further embodiment, the invention provides a method
for producing antibodies to HCV comprising administering an immunogenic amount of
HCV virus particles to an animal, and isolating anti-HCV antibodies from the animal. Such
antibodies may be used diagnostically, e.g., to detect the presence of HCV, or they may be
used therapeutically, e.g., in passive immunotherapy. A further method for producing
antibodies to HCV comprises screening a human antibody library for reactivity with HCV
virus particles of the invention and selecting a clone from the library that expresses an
antibody reactive with the HCV virus particle. Naturally, in addition to generating
antibodies, the authentic HCV viral particles and proteins of the invention represent
preferred starting materials for an HCV vaccine. Preferably, a vaccine of the invention
includes a pharmaceutically acceptable adjuvant.

The authentic materials provided herein provide a method for screening for agents capable
of modulating HCV replication in vitro and in vivo. Such methods include administering a
candidate agent to an HCV infected animal of the invention, and testing for an increase or
decrease in a level of HCV infection or activity compared to a level of HCV infection or
activity in the animal prior to administration of the candidate agent, wherein a decrease in
the level of HCV infection or activity compared to the level of HCV infection or activity in
the animal prior to administration of the candidate agent is indicative of the ability of the
agent to inhibit HCV infection or activity. Testing for the level of HCV infection can be
performed by measuring viral titer in a tissue sample from the animal; measuring viral
proteins in a tissue sample from the animal; or measuring liver enzymes. Alternatively, the
HCV genome used to infect the animal may include a heterologous gene operatively
associated with an expression control sequence, wherein the heterologous gene and
expression control sequence are oriented on the positive-strand nucleic acid molecule, and
testing for the level of HCV activity comprises measuring the level of a marker protein in a
tissue sample from the animal.


Alternatively, such analysis can proceed in vitro, e.g., by contacting the cell line of claim
32 with a candidate agent; and testing for an increase or decrease in a level of HCV
infection or activity compared to a level of HCV infection or activity in a control cell line
or in the cell line prior to administration of the candidate agent; wherein a decrease in the
level of HCV infection or activity compared to the level of HCV infection or activity in a
control cell line or in the cell line prior to administration of the candidate agent is indicative
of the ability of the agent to inhibit HCV infection or activity. Testing for the level of HCV
infection in vitro can be performed by measuring viral titer in the cells, culture medium, or
both; and measuring viral proteins in the cells, culture medium, or both. Alternatively,
the HCV genome used to infect the cell line may include a heterologous gene operatively
associated with an expression control sequence, wherein the heterologous gene and
expression control sequence are oriented on the positive-strand nucleic acid molecule, and
testing for the level of HCV activity comprises measuring the level of a marker protein in a
tissue sample from the animal.

A further method for screening for agents capable of modulating HCV replication involves
the cell free system described above. This method comprises contacting the in vitro system
of the invention with a candidate agent; and testing for an increase or decrease in a level of
HCV replication compared to a level of HCV replication in a control cell system or system
prior to administration of the candidate agent; wherein a decrease in the level of HCV
replication compared to the level of HCV replication in a control cell line or in the cell line
prior to administration of the candidate agent is indicative of the ability of the agent to
inhibit HCV infection or activity.

The invention includes a method for preparing an HCV nucleic acid comprising joining
from 5' to 3' on the positive-sense DNA a functional 5' non-translated region (NTR)
comprising an extreme 5'-terminal conserved sequence, a polyprotein coding region
encoding HCV proteins that provide for expression of functional HCV proteins, and a 3'
non-translated region (NTR) comprising an extreme 3'-terminal conserved sequence. The
method may further comprise determining a consensus sequence for the 5'-NTR,
polyprotein coding sequence, and 3'-NTR from a majority sequence of at least three clones
of an HCV isolate or genotype. In a specific embodiment, the 3'-NTR comprises an
extreme terminal sequence homologous to a DNA having the sequence
5'-GGTGGCTCCATCTTAGCCCTAGTCACGGCTAGCTGTGAAAGGTCCGTGAGCCG
CATGACTGCAGAGAGTGCTGATACTGGCCTCTCTGCTGATCATGT-3' (SEQ ID
NO:4). In a further specific embodiment, the HCV nucleic acid has a positive strand
sequence as depicted in or corresponding to SEQ ID NO:1 comprising substitution of a
homologous region from another HCV isolate or genotype.

The present invention also has significant diagnostic implications. In one embodiment, the
invention provides an in vitro method for detecting antibodies to HCV in a biological
sample from a subject comprising contacting a biological sample from a subject with HCV
virus particles of the invention, e.g., prepared as described above, under conditions that
permit binding of HCV-specific antibodies in the sample to the HCV virus particles; and
detecting binding of antibodies in the sample to the HCV virus particles, wherein detecting
binding of antibodies in the sample to the HCV virus particles is indicative of the presence
of antibodies to HCV in the sample.



An alternative in vitro method for detecting the presence of HCV in a biological sample
from a subject comprises contacting a cell line permissive for productive HCV infection
with a biological sample, wherein the cell line has been modified to contain a transgene that
expresses a reporter gene product under control of a trans-acting factor produced
by HCV; and detecting expression of the reporter gene product, wherein detection of
expression of the reporter gene product is indicative of the presence of HCV in the
biological sample from the subject. In a related embodiment, the invention provides an in
vitro method for detecting the presence of HCV in a biological sample from a subject
comprising contacting a cell line permissive for productive HCV infection with a biological
sample, wherein the cell line has been modified to contain a defective virus transgene,
which defective virus transgene will express a reporter gene product at high levels under
control of a trans-acting factor produced by HCV; and detecting expression of the reporter
gene product, wherein detection of expression of the reporter gene product is indicative of
the presence of HCV in the biological sample from the subject. Thus, a significant
advantage of the present invention is in providing permissive (or susceptible) cell lines for
these in vitro diagnostics. In the method according to claim 64, the defective viral
transgene produces an engineered alphavirus, the trans-acting helper factor is alphavirus
nsP4 polymerase, and the alphavirus nsP4 polymerase is expressed as a chimeric
fusion protein with HCV NS4A, such that the alphavirus nsP4 polymerase-HCV NS4A
chimeric fusion protein is cleaved by HCV NS3 proteinase to release functional alphavirus
nsP4 polymerase. In the foregoing methods, the biological sample is selected from the
group consisting of blood, serum, plasma, blood cells, lymphocytes, and liver tissue
biopsy.

In a related aspect, the invention also provides a test kit for HCV comprising authentic 
HCV virus components, and a diagnostic test kit for HCV comprising components derived
from an authentic HCV virus. 

Thus, a primary object of the present invention has been to provide a DNA encoding 
infectious HCV. 


A related object of the invention is to provide infectious HCV genomic RNA from DNA 
clones. 

Still another object of the invention is to provide attenuated HCV DNA or genomic RNA
suitable for vaccine development, which can invade a cell but fails to propagate infectious
virus.

Another object of the invention is to provide in vitro and in vivo models of HCV infection 
for testing anti-HCV (or antiviral) drugs, for evaluating drug resistance, and for testing
attenuated HCV viral vaccines. 

Still another object of the invention is to provide for expression of HCV virions or virus 
particle proteins that can be used to identify the HCV receptor, receptor binding 
antagonists, and in neutralization assays. In addition, expressed HCV virions or virus
particle proteins can be used to develop more effective HCV vaccines, with antigens that 
are structurally identical to or closely related to native HCV. 

A further object of the present invention is to provide HCV diagnostics based on the ability 
to detect infectious HCV using engineered reporter cells.

Yet another object is to provide authentic viral antigens, particularly viral particles, to assay 
for HCV-specific antibodies or generate HCV-specific antibodies. 

These and other objects of the present invention will be elaborated by the drawings and the
Detailed Description of the Invention. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURE 1 (PRIOR ART). HCV genome structure, polyprotein processing, and protein
features. At the top is depicted the viral genome with the structural and nonstructural
protein coding regions, and the 5' and 3' NTRs, and the putative 3' secondary structure.
Boxes below the genome indicate proteins generated by the proteolytic processing cascade.
Putative structural proteins are indicated by shaded boxes and the nonstructural proteins by
open boxes. Contiguous stretches of uncharged amino acids are shown by black bars.
Asterisks denote proteins with N-linked glycans but do not necessarily indicate the position
or number of sites utilized. Cleavage sites shown are for host signalase (4), the NS2-3
proteinase (curved arrow), and the NS3-4A serine protease (I).

FIGURE 2. Strategies for expression of heterologous RNAs and proteins using HCV
vectors. At the top is a diagram of the positive-polarity RNA virus HCV, which expresses
mature viral proteins by translation of a single long ORF and proteolytic processing. The
regions of the polyprotein encoding the structural proteins (STRUCTURAL) and the
nonstructural proteins (REPLICASE) are indicated as lightly-shaded and open boxes,
respectively. Below are shown a number of proposed replication-competent "replicon"
expression constructs. The first four constructs (A-D) lack structural genes and would
therefore require a helper system to enable packaging into infectious virions. Constructs E-G
would not require helper functions for replication or packaging. Darkly shaded boxes
indicate heterologous or foreign gene sequences (FG). Translation initiation (aug) and
termination signals (trm) are indicated by open triangles and solid diamonds, respectively.
Internal ribosome entry sites (IRES) are shown as boxes with vertical stripes. Constructs
A and H illustrate the expression of a heterologous product as an in-frame fusion with the
HCV polyprotein. Such protein fusion junctions can be engineered such that processing is
mediated either by host or viral proteinases (indicated by the arrow).


FIGURE 3. Engineered cell lines for assaying HCV infection. Panel A. Depicts a cell
expressing the three silent transgenes. Driven by nuclear promoter elements are: (i) an
mRNA expressing a polyprotein consisting of HCV NS4A fused to Sindbis virus
(Sin) nonstructural protein 4 (nsP4), (ii) a defective Sindbis virus replicon lacking the nsP4
coding region but containing a subgenomic promoter (arrow) driving expression of a reporter gene
(black box), (iii) a defective Sindbis virus RNA lacking the nsPs but containing a
ubiquitin-nsP4 fusion gene under the control of the subgenomic RNA promoter. The Sindbis
replicon and defective RNA contain all the signals necessary for Sindbis virus-specific
RNA replication, transcription and packaging (stem loop structure), but are silent in
the absence of active nsP4. Panel B. Upon productive infection of a susceptible cell by
HCV, the virus is uncoated, translated and begins replication (step 1). This results in the
production of active NS3 serine proteinase (step 2) which cleaves at the HCV NS4A-
Sindbis nsP4 junction (step 3) to produce active nsP4. nsP4 assembles with the other three
Sindbis nsPs to form an active Sindbis replication complex (step 4) which can replicate both
Sindbis specific RNAs and lead to transcription from the Sindbis virus subgenomic
promoters (step 5). Ub-nsP4 expressed from the subgenomic RNA of the defective RNA is
cleaved to form a more active form of the nsP4 polymerase which further amplifies
replication and transcription of the Sindbis-specific RNAs (step 6). This leads to high levels
of reporter gene expression (step 7).

FIGURE 4. Initial set of constructs tested in the chimpanzee model (chimpanzee experiment
I). Clones tested in the chimpanzee model before the correct HCV 5' and 3' termini had
been cloned and determined. Diagrams indicate the T7 or SP6 promoter elements, the
HCV cDNA, and the run-off sites used for production of transcripts terminating with either
poly (A) or poly (U).



FIGURE 5 (A and B). (A) Regions of HCV H77 amplified for the combinatorial library.
At the top, a diagram of the HCV H cDNA is shown with the restriction sites used for
cloning the combinatorial library (KpnI and NotI: open box) indicated. The region was
cloned into a recipient vector, pTET/HCVΔBglII/5'+3' corr. This recipient vector
contains HCV H77 consensus sequences for the 5' and 3' terminal regions, as shown in
black. Approximate protein boundaries are also indicated. Below, fragments amplified by
RT-PCR from HCV H77 RNA are denoted as A through G. The number above each
segment refers to the minimum complexity of the region in the library. Primer pairs and
exact positions are given in Tables 2 & 3. (B) Intermediate and final fragments in the
assembly of the combinatorial library. As detailed in Tables 2 and 3, infra, intermediates in
the assembly PCR process and their approximate locations in the HCV cDNA are shown.

FIGURE 6. Assembly PCR method. A general scheme of the assembly PCR method is
shown. Specific HCV fragments and primers used in assembly are listed in Table 3.

FIGURE 7. Example of complexity determination by PCR of cDNA dilutions. For
amplified regions A, D, and G, different dilutions of first-strand cDNA were checked for
successful amplification by PCR. Products were analyzed on an agarose gel. From this
analysis, the minimum complexity for these regions in the combinatorial library was 80, 10,
and 10 molecules of cDNA, respectively.



FIGURE 8 (A and B). Analysis of transcription efficiency through long poly (U/UC) tracts.
Using conditions for optimal transcription of HCV RNAs in vitro, transcription products
from several template DNAs are shown. (A) Lane 1, supercoiled pTET/HCVFL CMR/5'
3' corr. DNA; lane 2, XmnI-digested pTET/HCVFL CMR/5' 3' corr. template (predicted
size 11740 bases); lane 3, HpaI-digested pTET/HCVFL CMR/5' 3' corr. template
(predicted size ~9600 bases); lanes 4 and 5, transcribed RNA size markers of 1 1 J50 and
9400 bases, respectively. Transcription reactions contained 3 mM UTP and 1 mM ATP, GTP,
and CTP. (B) Lane 1, HpaI-digested p92/HCVFLlong pU/5'GG DNA (predicted size
~9600 bases); lane 2, XbaI-digested p92/HCVFLlong pU/5'GG DNA (predicted size
~13000 bases). Transcription reactions in panel B contained all four NTPs at 3 mM. In
both panels, HCV RNA transcripts terminating in the poly (U/UC) tract would be ~9500
bases in length. Lanes M in both panels are HindIII-digested lambda DNA size markers.



FIGURE 9. Sequence alignment for determination of the HCV H77 consensus sequence. 
An alignment of the HCV H sequences determined is shown. The nucleotide and amino
acid sequences at the bottom of each block are for the HCV H CMR prototype sequence.
Numbers of the sequenced clones from the combinatorial library are indicated at the left
(SEQ ID NOS:19, 20). GenBank refers to the HCV-H sequence determined by Inchauspe et
al. [Proc. Natl. Acad. Sci. USA 88:10292, 1991; Accession # M67463]. "cons." indicates
the HCV H77 consensus sequence [SEQ ID NO:1]. Positions identical to the HCV H CMR
sequence are indicated by dots; gaps in certain sequences by dashes. Where differences
were found, lower case letters indicate silent nucleotide substitutions; upper case letters 
indicate that a particular nucleotide substitution results in a coding change. 

FIGURE 10. Steps in the directed construction of the consensus clone. The diagram 
indicates the region of each sequenced clone used for directed construction of the consensus
clone. Primary fragments from each clone are indicated by hatched boxes, intermediate
assembly subclones as open boxes, and the final clones and regions used for assembly of
the full-length consensus clone as shaded boxes. Table 4 summarizes the details of the
cloning steps. 

FIGURE 11. Features/markers of the ten full-length clones tested in chimpanzee
experiment III. At the top is a schematic of the HCV H77 cDNA consensus RNA. The ten
RNA transcripts used for the successful chimpanzee inoculation experiment are diagramed
below. Additional 5' nucleotides and "short" versus "long" poly (U/UC) tracts are
indicated. All clones/transcripts included two silent nucleotide substitutions as markers:
position 899 (C instead of T; indicated by asterisks); and position 5936 (C instead of A;
indicated by circled asterisks). Clones with additional 5' bases contained a mutation
inactivating the XhoI site at position 514 (triangle). Clones with "short" versus "long" poly
(U/UC) tracts were distinguished by A (black dot) versus G at position 8054, respectively.

FIGURE 12. Serum samples from inoculated animals do not contain carryover template 
DNA. As shown, duplicate RNA samples (from 10 µl serum) from the indicated weeks
post-inoculation without (lane 1) or with 10^ (lanes 2-7) or 10^ (lanes 8-14) molecules of
added competitor RNA were amplified by RT-PCR with (+) or without (-) enzyme in the
reverse transcription step [Kolykhalov et al., J. Virol. 70:3363 (1996)]. No specific PCR
band was detected in the absence of cDNA synthesis, indicating that the HCV-specific
nucleic acid signal was due to RNA. The analysis shown is for chimpanzee #1535, which
received the highest level of inoculated HCV RNA and where the template DNA had not
been degraded by digestion with DNase I. 

FIGURE 13. Circulating HCV RNA from inoculated animals is protected from RNase. In
lane 1, 10 µl serum was mixed with 3 x 10^ molecules of competitor RNA, digested with
0.5 µg RNase A for 15 min at room temperature, extracted with RNAzol and utilized for
nested RT-PCR as described in [Kolykhalov, 1996, supra]. For the sample shown in lane
2, competitor RNA was added after lysis with RNAzol (no RNase treatment). In lane 3,
10 µl serum without competitor RNA was predigested with RNase A prior to extraction
with RNAzol as in lane 1. Lane 4 is a negative control for RT-PCR. The experiment
demonstrated that HCV RNA-containing material from the transfected chimps is
RNase-resistant under conditions where an excess of competitor RNA is completely
destroyed. The sample analyzed was from chimpanzee #1536 at week 6, in which the RNA
titer was 6 x 10^ molecules/ml.

DETAILED DESCRIPTION OF THE INVENTION

As pointed out above, the present invention advantageously provides an authentic hepatitis 
C virus (HCV) nucleic acid, e.g., DNA or RNA, clone. A functional HCV nucleic acid of 
the invention advantageously provides for infection of susceptible animals and cell lines. 
Despite arduous efforts, infectious HCV has not previously been successfully cloned, thus
precluding systematic evaluation of the virus's mechanisms of replication, receptor binding
and cell invasion, development of antiviral therapeutic agents using in vitro and in vivo 
assay systems, and development of sensitive in vitro diagnostic assay systems. In addition, 
the clones of the invention now enable expression of HCV particles and particle proteins
under conditions that permit proper processing, and thus expression of proteins that bear the
closest possible structural resemblance to native HCV. Such particles and proteins are 
preferred for anti-HCV vaccine development. In addition, by identifying the elements of 
the HCV genome that are necessary for infection, the present inventors advantageously 
harness the properties of HCV that lead to chronic liver infection for preparation of gene
therapy vectors. Such vectors are particularly useful since they target the liver, which is a
source of many proteins and thus a desirable organ for expression of a soluble factor to 
supplement a deficiency in a subject. 

The present invention is based, in part, on generation of a functional genotype 1a cDNA
clone, which can be used as a basis for preparation of functional clones for other HCV
genotypes (e.g., constructed and verified using similar methods). These products have a
variety of applications for development of (i) more effective HCV therapies; (ii) HCV
vaccines; (iii) HCV diagnostics; and (iv) HCV-based gene expression vectors. Examples of
these applications are described below.

The current invention describes the determination of an HCV consensus sequence and the 
use of this information to construct full-length HCV cDNA clones capable of yielding 
replication-competent infectious RNA transcripts. The rigorous determination of terminal 
sequences, including the discovery of highly conserved sequences at the 5' and 3' ends, the
use of less error-prone methods for amplifying and assembling HCV cDNA clones, and the 
assembly of clones reflecting a consensus sequence, all contributed to the success of the 
present invention. 

The term "authentic" is used herein to refer to an HCV nucleic acid, whether a DNA (i.e.,
cDNA) or RNA, that provides for full genomic replication and production of functional
HCV proteins, or components thereof. In a specific embodiment, an authentic HCV
nucleic acid is infectious, e.g., in a chimpanzee model or in tissue culture, forms viral
particles (i.e., virions), or both. However, an authentic HCV nucleic acid of the invention
may also be attenuated, such that it only produces some (not all) functional HCV proteins,
or it can productively infect cells without replication in the absence of a helper cell line or
plasmid, etc. The authentic HCV exemplified in the present application contains all of the
virus-encoded information, whether in RNA elements or encoded proteins, necessary for
initiation of an HCV replication cycle that corresponds to replication of wild-type virus in
vivo. The specific HCV clones described herein, including the embodiment deposited with
the ATCC and variants thereof described or exemplified in this application, represent a
preferred starting material for developing HCV therapeutics, vaccines, diagnostics, and
expression vectors. In particular, use of the HCV nucleic acids of the invention assures that
authentic HCV components are involved, since, unlike the cloned HCVs of the prior art,
these components together provide an infectious particle. The specific starting materials
described herein, and preferably the deposited plasmid clone harboring authentic HCV
cDNA, can be modified as described herein, e.g., by site-directed mutagenesis, to produce
a defective or attenuated derivative. Alternatively, sequences from other genotypes or
isolates can be substituted for the homologous sequence of the specific embodiments
described herein. For example, an authentic HCV nucleic acid of the invention may
comprise the consensus 5' and 3' sequences disclosed herein, e.g., on a recipient plasmid,
in which a polyprotein coding region from another isolate or genotype (either a consensus
region or one obtained by very high fidelity cloning) is substituted for the homologous
polyprotein coding region of the HCV exemplified herein. In addition, the general
characteristics of an authentic HCV as described herein, including but not limited to
containing extreme 5' or 3' sequences, or both, containing an ORF that encodes a
polyprotein whose cleavage products form functional components of HCV virus particles
and RNA replication machinery, and, in a preferred embodiment, incorporating a
consensus sequence of a specific isolate or genotype, provide for obtaining authentic HCV
clones.

In particular, the present invention provides for modifying or "correcting" non-functional
HCV clones, e.g., that are incapable of genuine replication, that fail to produce HCV
proteins, that do not produce HCV RNA as detected by Northern analysis, or that fail to
infect susceptible animals or cell lines in vitro. By comparing an authentic HCV nucleic
acid sequence of the invention, e.g., the cDNA sequence of SEQ ID NO:1, with the
sequence of the non-functional HCV clone, defects in the non-functional clone can be
identified and corrected. All of the methods for modifying nucleic acid sequences available
to one of skill in the art can be used to effect modifications in the non-functional HCV
genome, including but not limited to site-directed mutagenesis, substitution of the
functional sequence from an authentic HCV clone, e.g., of SEQ ID NO:1, for the
homologous sequence in the non-functional clone, etc.
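
By way of illustration only, the comparison of a non-functional clone against an authentic
reference sequence can be sketched as a position-by-position scan. The function name
find_differences and the short example sequences are hypothetical; the sketch assumes the
two sequences have already been aligned to equal length without gaps, which a real
comparison against SEQ ID NO:1 would first require.

```python
def find_differences(reference: str, clone: str) -> list[tuple[int, str, str]]:
    """List (1-based position, reference base, clone base) for every mismatch.

    Assumes both sequences are pre-aligned and of equal length (no gaps).
    """
    if len(reference) != len(clone):
        raise ValueError("sequences must be aligned to the same length")
    differences = []
    for index, (ref_base, clone_base) in enumerate(zip(reference.upper(), clone.upper())):
        if ref_base != clone_base:
            differences.append((index + 1, ref_base, clone_base))
    return differences


# Hypothetical toy sequences, not taken from any HCV clone.
print(find_differences("GCCAGCC", "GCTAGCA"))
```

Each reported position is a candidate for correction, e.g., by site-directed mutagenesis
or by substitution of the corresponding fragment from the authentic clone.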



The term "consensus sequence" is used herein to refer to a functional HCV genomic
sequence, or any portion thereof, including the 5'-NTR, polyprotein coding sequence or
portion thereof, and 3'-NTR, which is determined by identifying the consensus residues
from three or more, preferably six or more, independent clones of a strain or genotype of
HCV. In the Examples, infra, 5'-NTR (including some capsid proteins from the
polyprotein coding region) and 3'-NTR (including some portion of the genome encoding the
C-terminus of the polyprotein) consensus sequences were determined and incorporated in a
recipient plasmid (Example 3). Consensus sequences for the majority of the polyprotein
coding region from a KpnI site to a NotI site were also determined, as shown in Figure 8
and Example 4, infra. Insertion of the KpnI to NotI portion of the polyprotein coding
sequence into the recipient plasmid containing consensus 5' and 3' sequences yields an
authentic HCV genomic DNA clone.

The authentic HCV nucleic acid of the invention preferably includes a 5'-NTR extreme 
conserved sequence comprising the 5'-terminal sequence GCCAGCC, which may have
additional bases upstream of this conserved sequence without affecting functional activity of
the HCV nucleic acid. In a preferred embodiment, the 5'-GCCAGCC includes from 0 to
about 10 additional upstream bases; more preferably it includes from 0 to about 5 upstream
bases; more preferably still it includes 0, one, or two upstream bases. In specific
embodiments, the extreme 5'-terminal sequence may be GCCAGCC; GGCCAGCC;
UGCCAGCC; AGCCAGCC; AAGCCAGCC; GAGCCAGCC; GUGCCAGCC; or
GCGCCAGCC, wherein the sequence GCCAGCC is the 5'-terminus of SEQ ID NO:3.
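
As a minimal sketch of the 5'-terminal criterion just described, the following counts the
bases lying upstream of the conserved GCCAGCC element and checks the count against a
chosen limit. The function name upstream_bases and the default limit of 10 are
illustrative assumptions; DNA input is converted to RNA notation so either form can be
checked.

```python
CONSERVED_5P = "GCCAGCC"  # extreme 5'-terminal conserved sequence named in the text


def upstream_bases(sequence: str, max_upstream: int = 10):
    """Return the number of bases preceding the first GCCAGCC, or None.

    None is returned when the element is absent or lies more than
    max_upstream bases from the 5' end. T is treated as U.
    """
    rna = sequence.upper().replace("T", "U")
    position = rna.find(CONSERVED_5P)
    if position == -1 or position > max_upstream:
        return None
    return position


# The eight specific 5' termini listed in the text.
listed_termini = [
    "GCCAGCC", "GGCCAGCC", "UGCCAGCC", "AGCCAGCC",
    "AAGCCAGCC", "GAGCCAGCC", "GUGCCAGCC", "GCGCCAGCC",
]
for terminus in listed_termini:
    print(terminus, upstream_bases(terminus, max_upstream=2))
```

With max_upstream set to 2, all eight listed termini pass, each having zero, one, or two
upstream bases.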

In an authentic HCV nucleic acid of the invention, the 3'-NTR comprises a long
polypyrimidine region. In positive-strand HCV RNA, the region corresponds to a
poly(U)/poly(UC) tract. Naturally, in positive-strand HCV DNA, this is a
poly(T)/poly(TC) tract. The Examples, infra, show that the polypyrimidine tract may be of
variable length: both short (about 75 bases) and long (133 bases) are effective, although an 
HCV clone containing a long poly(U/UC) tract is found to be highly infectious. Longer 
tracts may be found in naturally occurring HCV isolates. Thus, an authentic HCV nucleic 
acid of the invention may have a variable length polypyrimidine tract. 
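
For illustration, the length of a polypyrimidine tract in a 3'-NTR sequence can be
estimated as the longest uninterrupted run of U and C residues. This is a simplifying
assumption (a natural poly(U/UC) tract may contain occasional interruptions), and the
function name and the toy sequence are hypothetical.

```python
import re


def longest_pyrimidine_tract(sequence: str) -> int:
    """Length of the longest uninterrupted run of U/C (T is treated as U)."""
    rna = sequence.upper().replace("T", "U")
    runs = re.findall(r"[UC]+", rna)
    return max((len(run) for run in runs), default=0)


# Hypothetical toy 3' region: 50 U residues followed by ten UC repeats.
toy_region = "AGG" + "U" * 50 + "UC" * 10 + "GGA"
print(longest_pyrimidine_tract(toy_region))
```

The result can then be compared with the approximate "short" (about 75 bases) and "long"
(133 bases) tract lengths mentioned above.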

In a specific embodiment of the invention, plasmid p90/HCVFL [long poly(U)] harboring a 
cDNA encoding an infectious HCV RNA under control of a phage promoter was deposited
with the American Type Culture Collection (ATCC), 12301 Parklawn Drive, Rockville,
Maryland, United States of America on February 13, 1997 on behalf of Washington
University School of Medicine for the purpose of compliance with the Budapest Treaty on
the International Recognition of the Deposit of Microorganisms for the Purposes of Patent
Protection in accordance with its provisions, and the provisions of 37 C.F.R. § 1.801 et
seq.



The benefits of this technology are enormous and far reaching. Of immediate significance 
is use of HCV cDNA from these functional clones as starting material for studies on the 
functions of individual HCV proteins and RNA elements using biochemical, cell culture, 
and transgenic animal approaches. The use of functional cDNA will minimize the chances 
of obtaining negative or misleading results because of errors introduced during cDNA 
synthesis or PCR-amplification. Such clones will also provide defined starting material for 
future molecular genetic studies on many aspects of HCV biology in the context of 
authentic virus replication. Uses relevant to therapy and vaccine development include: (i) 
the generation of defined HCV virus stocks to develop in vitro and in vivo assays for virus 
neutralization, attachment, penetration and entry; (ii) structure/function studies on HCV 
proteins and RNA elements and identification of new antiviral targets; (iii) a systematic 
survey of cell culture systems and conditions to identify those that support HCV RNA 
replication and particle release; (iv) production of adapted HCV variants capable of more 
efficient replication in cell culture; (v) production of HCV variants with altered tissue or 
species tropism; (vi) establishment of alternative animal models for inhibitor evaluation 
including those supporting HCV replication; (vii) development of cell-free HCV replication 
assays; (viii) production of immunogenic HCV particles for vaccination; (ix) engineering of 
attenuated HCV derivatives as possible vaccine candidates; (x) engineering of attenuated or 
defective HCV derivatives for expression of heterologous gene products for gene therapy
and vaccine applications; (xi) utilization of the HCV glycoproteins for targeted delivery of
therapeutic agents to the liver or other cell types with appropriate receptors. 

Various terms are used herein, which have the following definitions: 

The phrase "pharmaceutically acceptable" refers to molecular entities and compositions that
are physiologically tolerable and do not typically produce an allergic or similar untoward
reaction, such as gastric upset, dizziness and the like, when administered to a human.
Preferably, as used herein, the term "pharmaceutically acceptable" means approved by a
regulatory agency of the Federal or a state government or listed in the U.S. Pharmacopeia
or other generally recognized pharmacopeia for use in animals, and more particularly in
humans. The term "carrier" refers to a diluent, adjuvant, excipient, or vehicle with which
the compound is administered. Such pharmaceutical carriers can be sterile liquids, such as
water and oils, including those of petroleum, animal, vegetable or synthetic origin, such as
peanut oil, soybean oil, mineral oil, sesame oil and the like. Water or aqueous saline
solutions and aqueous dextrose and glycerol solutions are preferably employed as
carriers, particularly for injectable solutions. Suitable pharmaceutical carriers are described 
in "Remington's Pharmaceutical Sciences" by E.W. Martin. 

The phrase "therapeutically effective amount" is used herein to mean an amount sufficient 
to reduce by at least about 15 percent, preferably by at least 50 percent, more preferably by 
at least 90 percent, and most preferably prevent, a clinically significant deficit in the 
activity, function and response of the host. Alternatively, a therapeutically effective amount 
is sufficient to cause an improvement in a clinically significant condition in the host. 



The term "adjuvant" refers to a compound or mixture that enhances the immune response to 
an antigen. An adjuvant can serve as a tissue depot that slowly releases the antigen and 
also as a lymphoid system activator that non-specifically enhances the immune response
(Hood et al., Immunology, Second Ed., 1984, Benjamin/Cummings: Menlo Park,
California, p. 384). Often, a primary challenge with an antigen alone, in the absence of an
adjuvant, will fail to elicit a humoral or cellular immune response. Adjuvants include, but
are not limited to, complete Freund's adjuvant, incomplete Freund's adjuvant, saponin,
mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin,
pluronic polyols, polyanions, peptides, oil or hydrocarbon emulsions, keyhole limpet
hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille
Calmette-Guerin) and Corynebacterium parvum. Preferably, the adjuvant is
pharmaceutically acceptable. 

In a specific embodiment, the term "about" or "approximately" means within 20%, 
preferably within 10%, and more preferably within 5% of a given value or range. 
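
As a small worked example of this definition, the tolerance can be expressed as a fraction
of the stated value. The function name is_about is hypothetical, and the sketch assumes
the tolerance is measured relative to the stated (target) value.

```python
def is_about(value: float, target: float, tolerance: float = 0.20) -> bool:
    """True when value lies within the given fractional tolerance of target."""
    return abs(value - target) <= tolerance * abs(target)


# A tract of 85 bases compared with a stated value of "about 75 bases".
print(is_about(85, 75))        # within 20%
print(is_about(85, 75, 0.10))  # not within 10%
print(is_about(78, 75, 0.05))  # within 5%
```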

The following subsections of the application, which further amplify the foregoing
disclosure, are provided for convenience and not by way of limitation.

Functional Full-length Clones for Other HCV Genotypes
Using the approaches described here, functional full-length clones for the other HCV 
genotypes can be built and utilized for biological studies and antiviral screening and 
evaluation. In this extension of the invention, libraries can be constructed using RNA from 
single-exposure patients with high RNA titers (greater than 10»/ml) and known clinical
history. A consensus sequence for the isolate can be generated from the sequences of
individual clones in the library. New recipient plasmids containing a promoter, 5' and 3'
terminal consensus sequences (either determined for that isolate or from a different isolate,
e.g., HCV-H77), and a 3' restriction site for production of run-off transcripts can be
constructed. 



As less error-prone methods emerge, screening of a limited number of clones from
combinatorial libraries may yield functional clones. Alternatively, as described here,
sequences derived from multiple clones and directed assembly can be used to produce
functional consensus clones.

Thus, the present invention contemplates isolation of other HCV genomic sequences, or 
consensus genomic sequences. In accordance with the present invention there may be 
employed conventional molecular biology, microbiology, and recombinant DNA techniques 
within the skill of the art. Such techniques are explained fully in the literature. See, e.g.,
Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second Edition
(1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (herein
"Sambrook et al., 1989"); DNA Cloning: A Practical Approach, Volumes I and II (D.N.
Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid
Hybridization [B.D. Hames & S.J. Higgins eds. (1985)]; Transcription And Translation
[B.D. Hames & S.J. Higgins, eds. (1984)]; Animal Cell Culture [R.I. Freshney, ed.
(1986)]; Immobilized Cells And Enzymes [IRL Press, (1986)]; B. Perbal, A Practical Guide
To Molecular Cloning (1984); F.M. Ausubel et al. (eds.), Current Protocols in Molecular
Biology, John Wiley & Sons, Inc. (1994).



Therefore, if appearing herein, the following terms shall have the definitions set out below.

It should be appreciated that the terms for HCV sequences, such as the "3' terminal
sequence element," "3' terminus," and "3' sequence element," are meant to encompass all
of the following sequences: (i) an RNA sequence of the positive-sense genome RNA; (ii) the
complement of this RNA sequence, i.e., the HCV negative-sense RNA; (iii) the DNA
sequence corresponding to the positive-sense sequence of the RNA element; and (iv) the
DNA sequence corresponding to the negative-sense sequence of the RNA element.
Accordingly, nucleotide sequences displaying substantially equivalent or altered properties
are likewise contemplated. These modifications may be deliberate, for example, such as
modifications obtained through site-directed mutagenesis, or may be accidental, such as
those obtained through mutations in hosts that are producers of the complex or its named
subunits.
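
The four sequence forms (i) through (iv) enumerated above can be derived mechanically
from the positive-sense RNA, as sketched below. The function name sequence_forms and the
dictionary keys are hypothetical, and the sketch assumes the negative-sense strand is
written 5' to 3', i.e., as the reverse complement.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def sequence_forms(positive_rna: str) -> dict[str, str]:
    """Return the four forms of an RNA element, each written 5' to 3'."""
    plus_rna = positive_rna.upper()
    # Negative-sense RNA: reverse complement of the positive-sense RNA.
    minus_rna = "".join(RNA_COMPLEMENT[base] for base in reversed(plus_rna))
    return {
        "positive_rna": plus_rna,                     # form (i)
        "negative_rna": minus_rna,                    # form (ii)
        "positive_dna": plus_rna.replace("U", "T"),   # form (iii)
        "negative_dna": minus_rna.replace("U", "T"),  # form (iv)
    }


forms = sequence_forms("GCCAGCC")
for name, seq in forms.items():
    print(name, seq)
```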

A "vector" is a replicon, such as a plasmid, phage, or cosmid, to which another DNA (or 
RNA) segment may be joined so as to bring about the replication of the attached segment.
A "cassette" refers to a segment of DNA or RNA that can be inserted into a vector at specific
restriction sites. The segment of DNA or RNA encodes a polypeptide or RNA of interest,
and the cassette and restriction sites are designed to ensure insertion of the cassette in the 
proper reading frame for transcription and translation. 

Transcriptional and translational control sequences are DNA or RNA regulatory sequences, 
such as promoters, enhancers, polyadenylation signals, terminators, IRES elements, and the 
like, that provide for the expression of a coding sequence in a host cell. A coding sequence 
is "under the control of" or "operably (also operatively) associated with" transcriptional and 
translational control sequences in a cell when RNA polymerase transcribes the coding 
sequence into RNA. RNA sequences can also serve as expression control sequences by 
virtue of their ability to modulate translation, RNA stability, RNA replication, and RNA 
transcription (for RNA viruses). 

A "promoter sequence" is a DNA or RNA regulatory region capable of binding RNA 
polymerase in a cell and initiating transcription of a downstream (3' direction) coding or
noncoding sequence. Thus, promoter sequences can also be used to refer to analogous
RNA sequences or structures of similar function in RNA virus replication and transcription.
Preferred promoters for cell-free or bacterial expression of infectious HCV DNA clones of
the invention are the phage promoters T7, T3, and SP6. Alternatively, a nuclear promoter,
such as cytomegalovirus immediate-early promoter, can be used. Indeed, depending on the
system used, expression may be driven from a eukaryotic, prokaryotic, or viral promoter
element. Promoters for expression of HCV RNA can provide for capped or uncapped
transcripts.



As used herein, the term "homologous" in all its grammatical forms and spelling variations
refers to the relationship between proteins that possess a "common evolutionary origin,"
including proteins from superfamilies (e.g., the immunoglobulin superfamily) and
homologous proteins from different species (e.g., myosin light chain, etc.) [Reeck et al.,
Cell 50:667 (1987)]. Such proteins (and their encoding genes) have a high degree of
sequence similarity. The term "sequence similarity" in all its grammatical forms refers to
the degree of identity or correspondence between nucleic acid or amino acid sequences of
proteins that may or may not share a common evolutionary origin [see Reeck et al., supra].
However, in common usage and in the instant application, the term "homologous," when
modified with an adverb such as "substantially" or "highly," may refer to sequence
similarity and not a common evolutionary origin.

In a specific embodiment, two DNA or RNA sequences are "homologous" or "substantially
similar" when at least about 50% (preferably at least about 75%, and most preferably at
least about 90 or 95%) of the nucleotides match over the defined length of the DNA
sequences. Sequences that are substantially homologous can be identified by comparing the
sequences using standard software available in sequence data banks, or in a Southern
hybridization experiment under, for example, stringent conditions as defined for that
particular system. Defining appropriate hybridization conditions is within the skill of the
art. See, e.g., Maniatis et al., supra; DNA Cloning, Vols. I & II, supra; Nucleic Acid
Hybridization, supra.
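
The percentage criterion above can be illustrated with a simple identity calculation over a
defined length. This is a sketch only: it assumes the two sequences are already aligned
to equal length without gaps, whereas standard alignment software would handle gaps and
scoring. The function names and toy sequences are hypothetical.

```python
def percent_identity(first: str, second: str) -> float:
    """Percentage of matching positions between two pre-aligned sequences."""
    if len(first) != len(second) or not first:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(first.upper(), second.upper()))
    return 100.0 * matches / len(first)


def is_substantially_similar(first: str, second: str, threshold: float = 50.0) -> bool:
    """Apply a nucleotide identity threshold (50% by default; 75%, 90% or 95% are stricter)."""
    return percent_identity(first, second) >= threshold


print(percent_identity("AAAA", "AATT"))
print(is_substantially_similar("AAAAAAAAAA", "AAAAAAAAAT", threshold=90.0))
```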

Similarly, in a particular embodiment, two amino acid sequences are "homologous" or
"substantially similar" when greater than 30% of the amino acids are identical, or greater
than about 60% are similar (functionally identical). Preferably, the similar or homologous
sequences are identified by alignment using, for example, the GCG (Genetics Computer
Group, Program Manual for the GCG Package, Version 7, Madison, Wisconsin) pileup
program.

The term "corresponding to" in relation to nucleic acid or amino acid structure is used
herein to refer to similar or homologous sequences, whether the exact position is identical or
different from the molecule to which the similarity or homology is measured. A nucleic
acid or amino acid sequence alignment may include gaps. Thus, the term "corresponding
to" refers to the sequence similarity or regions of homology, and not the numbering of the
amino acid residues or nucleotide bases.

HCV genomic nucleic acids can be isolated from any source of infectious HCV, 
particularly from tissue samples (blood, plasma, serum, liver biopsy, leukocytes, etc.) from 
an infected human or simian, or other permissive animal species. Methods for obtaining 
genomic HCV clones or portions thereof are well known in the art, as described above 
[see, e.g., Sambrook et al., 1989, supra]. HCV isolates, including polyprotein coding
region sequences, are described, for example, in International Patent Publication WO 
89/04669, published June 1, 1989 by Houghton et al.; International Patent Publication WO 
90/11089, published October 4, 1990 by Houghton et al.; U.S. Patent No. 5,350,671, 
issued September 27, 1994 to Houghton et al.; U.S. Patent No. 5,372,928, issued 
December 13, 1994 to Miyamura et al.; European Patent Application No. EP 0 521 318 
A2, published January 7, 1993 for Cho et al.; and European Patent Application No. EP 0 
510 952 Al, published October 28, 1992, each of which is incorporated herein by reference 
in its entirety. Representative genotypes further include, but are by no means restricted to,
other 1a isolates, 1b, 1c, 2a, 2b, 2c, 3a, etc. [Bukh et al., (1995) supra; Simmonds,
Hepatology 21: 570-83 (1995); Simmonds et al., Hepatology 19: 1321-1324 (1994);
Simmonds et al., J. Gen. Virol. 77:3013-3024 (1996)]. For many subtypes and genotypes,
enough sequence data are available to design primers for RT/PCR and PCR assembly.

In the molecular cloning of genomic HCV RNA or DNA, DNA fragments are generated, e.g.,
by reverse transcription into cDNA and PCR. These fragments may be assembled to form
a full length sequence. Preparation of many such fragments provides a combinatorial
library of HCV clones. Such a library may yield an infectious clone; more likely, the
consensus sequence should be determined by comparing the sequences of all or a significant
number of clones from such a library. Enough clones should be evaluated so that a
majority of bases at any divergent position are identical. Thus, a consensus may be
determined by analyzing the sequence of at least three clones, preferably about five clones,
and more preferably six or more clones. Naturally, the more error-prone the cloning
method, the greater the number of clones that should be sequenced to yield an authentic
HCV consensus sequence.
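
The majority rule described above can be sketched as a column-by-column vote over aligned
clone sequences. The sketch is illustrative only: the function name consensus, the use of
"N" for unresolved positions, and the toy sequences are assumptions, and the clones are
assumed to be pre-aligned to equal length.

```python
from collections import Counter


def consensus(clones: list[str], min_clones: int = 3) -> tuple[str, list[int]]:
    """Majority-rule consensus of pre-aligned clone sequences.

    Returns the consensus string and the 0-based positions at which no base
    is shared by a strict majority of clones (written as 'N'); such positions
    indicate that more clones should be sequenced.
    """
    if len(clones) < min_clones:
        raise ValueError(f"at least {min_clones} clones are required")
    if len({len(clone) for clone in clones}) != 1:
        raise ValueError("clones must be aligned to the same length")
    bases, unresolved = [], []
    for position, column in enumerate(zip(*(c.upper() for c in clones))):
        base, count = Counter(column).most_common(1)[0]
        if 2 * count > len(clones):
            bases.append(base)
        else:
            bases.append("N")
            unresolved.append(position)
    return "".join(bases), unresolved


# Hypothetical toy clones; the third carries a single divergent base.
print(consensus(["GCCAGCC", "GCCAGCC", "GCTAGCC"]))
```

A position with no strict majority, as in a two-against-two split among four clones, is
reported rather than guessed, consistent with the need to evaluate enough clones.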

The consensus sequence can then be used to prepare an infectious HCV DNA clone. The
fidelity of the resulting clones is preferably established by sequencing. However, selection
can be carried out on the basis of the properties of the clone, e.g., if the clone encodes an
infectious HCV RNA. Thus, successful preparation of an infectious HCV DNA clone may
be detected by assays based on the physical, pathological, or immunological properties of
an animal or cell culture transfected or infected with the clone. For example, cDNA clones
can be selected that produce an HCV virion or virus particle protein that, e.g., has similar
or identical physical-chemical, electrophoretic migration, isoelectric focusing, or non- 
equilibrium pH gel electrophoresis behavior, proteolytic digestion maps, or antigenic 
properties as known for native HCV or HCV virus particle proteins. 

Components of functional HCV cDNA clones. Components of the functional HCV cDNA
described in this invention can be used to develop cell-free, cell culture, chimeric virus, and
animal-based screening assays for known or newly identified HCV antiviral targets as
described infra. Examples of known or suspected targets and assays include [see
Houghton, in "Fields Virology" (B. N. Fields, D. M. Knipe and P. M. Howley, Eds.), Vol.
pp. 1035-1058, Raven Press, New York (1996); Rice, (1996) supra; Rice et al., Antiviral
Therapy 1, Suppl. 4, 11-17 (1997); Shimotohno, Hepatology 21: 887-8 (1995) for
reviews], but are not limited to, the following:

The highly conserved 5' NTR, which contains elements essential for translation of the
incoming HCV genome RNA, is one target. It is also likely that this sequence, or its
complement, contains RNA elements important for RNA replication and/or packaging.
Potential therapeutic strategies include: antisense oligonucleotides (supra); trans-acting
ribozymes (supra); RNA decoys; small molecule compounds interfering with the function
of this element (these could act by binding to the RNA element itself or to cognate viral or 
cellular factors required for activity). 

Another target is the HCV C (capsid or core) protein, which is highly conserved and is
associated with the following functions: RNA binding and specific encapsidation of HCV
genome RNA; transcriptional modulation of cellular [Ray et al., Virus Res. 37: 209-220
(1995)] and other viral [Shih et al., J. Virol. 69: 1160-1171 (1995); Shih et al., J. Virol.
67: 5823-5832 (1993)] genes; cellular transformation [Ray et al., J. Virol. 70: 4438-4443
(1996a)]; prevention of apoptosis [Ray et al., Virol. 226: 176-182 (1996b)]; modulation of
host immune response through binding to members of the TNF receptor superfamily
[Matsumoto et al., J. Virol. 71: 1301-1309 (1997)].

The E1, E2, and E2-p7 glycoproteins, which form the components of the virion envelope,
are targets for potentially neutralizing antibodies. Key steps for intervention include:
signal peptidase mediated cleavage of these precursors from the polyprotein [Lin et al.,
(1994a) supra]; ER assembly of the E1E2 glycoprotein complex and association of these
proteins with cellular chaperones and folding machinery [Dubuisson et al., (1994) supra;
Dubuisson and Rice, J. Virol. 70: 778-786 (1996)]; assembly of virus particles including
interactions between the nucleocapsid and virion envelope; transport and release of virus
particles; the association of virus particles with host components such as VLDL [Hijikata et
al., (1993) supra; Thomssen et al., (1992) supra; Thomssen et al., Med. Microbiol.
Immunol. 182: 329-334 (1993)] which may play a role in evasion of immune surveillance
or in binding and entry of cells expressing the LDL receptor; conserved and variable
determinants in the virion which are targets for neutralization by antibodies or which bind
to antibodies and facilitate immune-enhanced infection of cells via interaction with cognate
Fc receptors; conserved and variable determinants in the virion important for receptor
binding and entry; virion determinants participating in entry, fusion with cellular
membranes, and uncoating the incoming viral nucleocapsid.

The NS2-3 autoprotease, which is required for cleavage at the 2/3 site, is a further target. 

The NS3 serine protease and NS4A cofactor, which form a complex and mediate four 
cleavages in the HCV polyprotein [see Rice, (1997) supra for review], are yet another 
suitable target. Targets include the serine protease activity itself; the tetrahedral Zn2+ 
coordination site in the C-terminal domain of the serine protease; the NS3-NS4A cofactor 
interaction; the membrane association of NS4A; stabilization of NS3 by NS4A; and the 
transforming potential of the NS3 protease region [Sakamuro et al., J. Virol. 69: 3893-6 
(1995)]. 

The NS3 RNA-stimulated NTPase [Suzich et al., (1993) supra], RNA helicase [Jin and 
Peterson, Arch. Biochem. Biophys. 323: 47-53 (1995); Kim et al., Biochem. Biophys. Res. 
Commun. 215: 160-6 (1995)], and RNA binding [Kanai et al., FEBS Lett. 376: 221-4 
(1995)] activities; the NS4A protein as a component of the RNA replication complex of as 
yet undefined function; and the NS5A protein, another presumed replication component that is 
phosphorylated predominantly on serine residues [Tanji et al., J. Virol. 69: 3980-3986 
(1995)], are all targets for drug development. Possible characteristics of the latter which 
could be targets for therapy include the kinase responsible for NS5A phosphorylation and 
its interaction with NS5A, and the interaction between NS5A and other components of the HCV 
replication complex. 

The NS5B RDRP, which is the enzyme responsible for the actual synthesis of HCV positive- 
and negative-strand RNAs, is another target. Specific aspects of its activity include the 
polymerase activity itself [Behrens et al., EMBO J. 15: 12-22 (1996)]; interactions of NS5B 
with other replicase components, including the HCV RNAs; steps involved in the initiation 
of negative- and positive-strand RNA synthesis; and phosphorylation of NS5B [Hwang et al., 
Virology 227: 438 (1997)]. 

Other targets include structural or nonstructural protein functions important for HCV RNA 
replication and/or modulation of host cell function. Possible hydrophobic protein 
components capable of forming channels important for viral entry, egress or modulation of 
host cell gene expression may be targeted. 

The 3' NTR, especially the highly conserved elements (poly (U/UC) tract; 98-base terminal 
sequence) can be targeted. Therapeutic approaches parallel those described for the 5' 
NTR, except that this portion of the genome is likely to play a key role in the initiation of 
negative-strand synthesis. It may also be involved in other aspects of HCV RNA 
replication, including translation, RNA stability, or packaging. 

The functional HCV cDNA clones encode all of the viral proteins and RNA elements 
required for RNA packaging. These elements can be targeted for development of antiviral 
compounds. Electrophoretic mobility shift, UV cross-linking, filter binding, and three-hybrid 
[SenGupta et al., Proc. Natl. Acad. Sci. USA 93: 8496-8501 (1996)] assays can be 
used to define the protein and RNA elements important for HCV RNA packaging and to 
establish assays to screen for inhibitors of this process. Such inhibitors might include small 
molecules or RNA decoys produced by selection in vitro [Gold et al., (1995) supra]. 

Complex HCV libraries can be prepared using PCR shuffling, or by incorporating 
randomized sequences, such as are generated in "peptide display" libraries. Using the 
"phage method" [Scott and Smith, 1990, Science 249:386-390 (1990); Cwirla et al., Proc. 
Natl. Acad. Sci. 87:6378-6382 (1990); Devlin et al., Science 249:404-406 (1990)], very 
large libraries can be constructed (10^6-10^8 chemical entities). As noted above, and 
exemplified infra, clones from such libraries can be used to generate a consensus genomic 
sequence. 

Due to the degeneracy of nucleotide coding sequences, other DNA sequences that encode 
substantially the same amino acid sequence as an HCV polyprotein coding region may be 
used in the practice of the present invention. These include but are not limited to 
homologous genes from other species, and nucleotide sequences comprising all or portions 
of HCV polyprotein genes altered by the substitution of different codons that encode the 
same amino acid residue within the sequence, thus producing a silent change. Such silent 
changes permit creation of genomic markers, which can be used to identify a particular 
infectious isolate in a multiple infection animal model. Likewise, the HCV genomic 
derivatives of the invention include, but are not limited to, those containing, as a primary 
amino acid sequence, all or part of the amino acid sequence of an HCV polyprotein 
including altered sequences in which functionally equivalent amino acid residues are 
substituted for residues within the sequence resulting in a conservative amino acid 
substitution. For example, one or more amino acid residues within the sequence can be 
substituted by another amino acid of a similar polarity, which acts as a functional 
equivalent, resulting in a silent alteration. Substitutes for an amino acid within the sequence 
may be selected from other members of the class to which the amino acid belongs. For 
example, the nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, 
valine, proline, phenylalanine, tryptophan and methionine. Amino acids containing 
aromatic ring structures are phenylalanine, tryptophan, and tyrosine. The polar neutral 
amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and 
glutamine. The positively charged (basic) amino acids include arginine, lysine and 
histidine. The negatively charged (acidic) amino acids include aspartic acid and glutamic 
acid. 

Particularly preferred substitutions are: 

- Lys for Arg and vice versa such that a positive charge may be maintained; 

- Glu for Asp and vice versa such that a negative charge may be maintained; 

- Ser for Thr such that a free -OH can be maintained; and 

- Gln for Asn such that a free NH2 can be maintained. 
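
The residue classes and preferred substitutions listed above can be expressed as a simple lookup. The following Python sketch is illustrative only and is not part of the described methods; the function names (classes_of, is_conservative, is_preferred, classify_substitutions) are hypothetical, and the rule "two residues sharing at least one listed class count as a conservative pair" is an assumption made for the example.

```python
# Residue classes transcribed from the text above (one-letter codes).
AMINO_ACID_CLASSES = {
    "nonpolar": set("ALIVPFWM"),
    "aromatic": set("FWY"),
    "polar_neutral": set("GSTCYNQ"),
    "basic": set("RKH"),
    "acidic": set("DE"),
}

# (original, substitute) pairs listed above as particularly preferred.
PREFERRED_SUBSTITUTIONS = {
    ("R", "K"), ("K", "R"),   # Lys for Arg and vice versa
    ("D", "E"), ("E", "D"),   # Glu for Asp and vice versa
    ("T", "S"),               # Ser for Thr
    ("N", "Q"),               # Gln for Asn
}


def classes_of(residue):
    """Return the names of every listed class that contains the residue."""
    code = residue.upper()
    return {name for name, members in AMINO_ACID_CLASSES.items() if code in members}


def is_conservative(original, substitute):
    """Assumed rule: conservative if the two residues share at least one class."""
    return bool(classes_of(original) & classes_of(substitute))


def is_preferred(original, substitute):
    """True if the change is one of the particularly preferred substitutions."""
    return (original.upper(), substitute.upper()) in PREFERRED_SUBSTITUTIONS


def classify_substitutions(reference, variant):
    """List (1-based position, reference residue, variant residue, conservative?)
    for every position at which two equal-length protein sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (pos, ref, var, is_conservative(ref, var))
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]
```

For instance, classify_substitutions("MKTD", "MRTK") reports a conservative Lys-to-Arg change at position 2 and a non-conservative Asp-to-Lys change at position 4; the four-residue sequences are toy inputs, not HCV sequence.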

In another embodiment, an authentic HCV clone can be modified to introduce amino acid 
substitutions that reduce or eliminate protein function. An authentic HCV clone can also be 
modified to introduce amino acid substitutions that alter viral tropism. 

Moreover, since HCV lacks proofreading activity, the virus itself readily mutates, forming 
mutant "quasi-species" of HCV that are also contemplated as within the present invention. 
Such mutations are easily identified by sequencing isolates from a subject, as detailed 
herein. 

The clones encoding HCV derivatives and analogs of the invention can be produced by 
various methods known in the art. The manipulations which result in their production can 
occur at the gene or protein level. For example, the cloned HCV genome sequence can be 
modified by any of numerous strategies known in the art [Sambrook et al., 1989, supra]. 
The genomic sequence can be cleaved at appropriate sites with restriction endonuclease(s), 
followed by further enzymatic modification if desired, isolated, and ligated in vitro. 
Alternatively, genomic fragments can be joined, e.g., with PCR, to create an HCV 
genome. In the production of the genomic nucleic acid derivative or analog of HCV, care 
should be taken to ensure that the modified genome remains within the same translational 
reading frame as the native HCV genome, uninterrupted by translational stop signals, in the 
region where the desired activity is encoded. 
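
The reading-frame precaution described above can be checked computationally before a modified clone is constructed. The following Python sketch is a minimal illustration under stated assumptions: the coding region is supplied as a DNA (or RNA) string beginning at the first codon and ending with its terminal stop codon, the standard genetic code stop codons apply, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def internal_stop_positions(orf):
    """Return 0-based codon indices of stop codons occurring before the final codon."""
    seq = orf.upper().replace("U", "T")
    n_codons = len(seq) // 3
    return [i for i in range(n_codons - 1) if seq[3 * i:3 * i + 3] in STOP_CODONS]


def keeps_reading_frame(native_orf, modified_orf):
    """True if the modified coding region is a whole number of codons, differs
    in length from the native region by a multiple of three, and contains no
    premature stop codon."""
    whole_codons = len(modified_orf) % 3 == 0
    in_frame_change = (len(modified_orf) - len(native_orf)) % 3 == 0
    return whole_codons and in_frame_change and not internal_stop_positions(modified_orf)
```

With the toy sequence ATGGCTGGTGATGATTAA as the native region, an in-frame nine-nucleotide deletion (ATGGCTTAA) passes the check, whereas a single-nucleotide deletion or the introduction of an internal TAA codon fails it.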

The HCV polyprotein-encoding nucleic acid sequence can be mutated in vitro or in vivo, to 
create and/or destroy translation, initiation, and/or termination sequences, or to create 
variations in coding regions and/or form new restriction endonuclease sites or destroy 
preexisting ones, to facilitate further in vitro modification. Preferably, such mutations 
provide for modification of the functional activity of the HCV, e.g., to attenuate viral 
activity, or create a defective virus, as set forth infra. Any technique for mutagenesis 
known in the art can be used, including but not limited to, in vitro site-directed mutagenesis 
[Hutchinson, C., et al., 1978, J. Biol. Chem. 253:6551; Zoller and Smith, 1984, DNA 
3:479-488; Oliphant et al., 1986, Gene 44:177; Hutchinson et al., 1986, Proc. Natl. Acad. 
Sci. U.S.A. 83:710], use of TAB® linkers (Pharmacia), etc. PCR techniques are preferred 
for site directed mutagenesis [see Higuchi, 1989, "Using PCR to Engineer DNA", in PCR 
Technology: Principles and Applications for DNA Amplification, H. Erlich, ed., Stockton 
Press, Chapter 6, pp. 61-70]. 

Adaptation of HCV for more efficient replication in cell culture or alternative hosts. As 
mentioned earlier, HCV replication in cell culture is inefficient. The engineering of 
dominant selectable markers under the control of the HCV replication machinery can also be 
used to select for adaptive mutations in the HCV replication machinery. Such adaptive 
mutations could be manifested as, but are not restricted to: (i) altering the tropism of HCV 
RNA replication; (ii) altering viral products responsible for deleterious effects on host cells; 
(iii) increasing or decreasing HCV RNA replication efficiency; (iv) increasing or decreasing 
HCV RNA packaging efficiency and/or assembly and release of HCV particles; (v) altering 
cell tropism at the level of receptor binding and entry. Even if the sequence of an original 
HCV cDNA clone is incompatible with establishing replication in a particular cell type, 
mutations occurring during in vitro transcription, during the initial stages of HCV-mediated 
RNA synthesis, or incorporated in the template DNA by a variety of chemical or biological 
methods, supra, may allow replication in a particular cellular environment or animal host. 
The engineered dominant selectable marker, whose expression is dependent upon 
productive HCV RNA replication, can be used to select for adaptive mutations in either the 
HCV replication machinery or the transfected host cell, or both. 

Chimeric HCV clones. Components of these functional clones can also be used to construct 
chimeric viruses for assay of HCV gene functions and inhibitors thereof [Filocamo et al., J. 
Virol. 71: 1417-1427 (1997); Hahm et al., Virology 226: 318-326 (1996); Lu and 
Wimmer, Proc. Natl. Acad. Sci. USA 93: 1412-7 (1996)]. In one such extension of the 
invention, functional HCV elements such as the 5' IRES, proteases, RNA helicase, 
polymerase, or 3' NTR are used to create chimeric derivatives of BVDV whose productive 
replication is dependent on one or more of these HCV elements. Such BVDV/HCV 
chimeras can then be used to screen for and evaluate antiviral strategies against these 
functional components. 

In addition, dominant selectable markers can be used to select for mutations in the HCV 
replication machinery that allow higher levels of RNA replication or particle formation. In 
one example, engineered HCV derivatives expressing a mutant form of DHFR can be used 
to confer resistance to methotrexate (MTX). As a dominant selectable marker, mutant 
DHFR is inefficient since nearly stoichiometric amounts are required for MTX resistance. 
By successively increasing concentrations of MTX in the medium, increased quantities of 
DHFR will be required for continued survival of cells harboring the replicating HCV RNA. 
This selection scheme, or similar ones based on this concept, can result in the selection of 
mutations in the HCV RNA replication machinery allowing higher levels of HCV RNA 
replication and RNA accumulation. Similar selections can be applied for mutations 
allowing production of higher yields of HCV particles in cell culture or for mutant HCV 
particles with altered cell tropism. Such selection schemes involve harvesting HCV 
particles from culture supernatants or after cell disruption and selecting for MTX-resistant 
transducing particles by reinfection of naive cells. 

The identified and isolated genomic RNA can be reverse transcribed into its cDNA. cDNA 
could also be made by "long" PCR to include the promoter and run-off site, or by using 3'- 
terminal consensus sequence-specific primers for insertion in an appropriate recipient 
vector. Any of these cDNAs may be inserted into an appropriate cloning vector, e.g., 
one which comprises consensus 5'- and 3'-NTRs, along with a suitable promoter and 3'-runoff 
sequence. A clone that includes a promoter and run-off sequence can be used directly for 
production of functional HCV RNA. A large number of vector-host systems known in the 
art may be used. Examples of vectors include, but are not limited to, E. coli, 
bacteriophages such as lambda derivatives, or plasmids such as pBR322 derivatives or pUC 
plasmid derivatives, e.g., pGEX vectors, pmal-c, pFLAG, pTET, etc. The insertion into a 
cloning vector can, for example, be accomplished by ligating the DNA fragment into a 
cloning vector which has complementary cohesive termini. However, if the complementary 
restriction sites used to fragment the DNA are not present in the cloning vector, the ends of 
the DNA molecules may be enzymatically modified. Alternatively, any site desired may be 
produced by ligating nucleotide sequences (linkers) onto the DNA termini; these ligated 
linkers may comprise specific chemically synthesized oligonucleotides encoding restriction 
endonuclease recognition sequences. Recombinant molecules can be introduced into host 
cells via transformation, transfection, infection, electroporation, etc., so that many copies 
of the gene sequence are generated. 
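
The choice between direct ligation and end modification or linker addition depends on whether the recognition site of interest occurs in the cloning vector. The following Python sketch is illustrative only; it assumes a linear representation of the vector (a site spanning the origin of a circular map would be missed), covers only three enzymes whose palindromic recognition sequences are well established, and uses hypothetical function names and strategy labels.

```python
# Recognition sequences (5'->3') of three common restriction endonucleases.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def site_positions(sequence, enzyme):
    """Return 0-based start positions of every occurrence of the enzyme's site.
    The sites are palindromic, so scanning one strand is sufficient."""
    site = RECOGNITION_SITES[enzyme]
    seq = sequence.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def cloning_strategy(vector, enzyme):
    """Suggest an approach based on how often the site occurs in the vector."""
    count = len(site_positions(vector, enzyme))
    if count == 1:
        return "direct ligation into the unique site"
    if count == 0:
        return "modify the fragment ends or add linkers"
    return "site is not unique; choose another enzyme"
```

For a toy vector sequence such as AAAGAATTCTTTGGATCCAAAGGATCC, the EcoRI site is unique, the BamHI site occurs twice, and no HindIII site is present, so the three enzymes map to the three different outcomes.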



The HCV DNA, which codes for HCV RNA and HCV proteins, particularly HCV RNA 
replicase or virion proteins, can be inserted into an appropriate expression vector, i.e., a 
vector which contains the necessary elements for the transcription and translation of the 
inserted protein-coding sequence. Such elements are termed herein a "promoter." Thus, 
the HCV DNA of the invention is operationally (or operably) associated with a promoter in 
an expression vector of the invention. An expression vector also preferably includes a 
replication origin. The necessary transcriptional and translational signals can be provided 
on a recombinant expression vector. In a preferred embodiment for in vitro synthesis of 
functional RNAs, the T7, T3, or SP6 promoter is used. 
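
As an illustration of in vitro run-off transcription from a T7 promoter, the following Python sketch predicts the transcript from a linearized template. It is a simplified model, not a description of any particular clone of the invention: it assumes the commonly used minimal T7 promoter sequence TAATACGACTCACTATAG, with transcription initiating at its final G, assumes the template has been linearized so that its top strand ends exactly at the run-off site, and ignores non-templated nucleotide addition. The function name is hypothetical.

```python
# Commonly used minimal T7 promoter; the final G is the first transcribed nucleotide.
T7_PROMOTER = "TAATACGACTCACTATAG"


def runoff_transcript(linear_template):
    """Return the predicted RNA (5'->3') made from the top strand of a
    linearized template, from the +1 G of the T7 promoter to the template end."""
    seq = linear_template.upper()
    index = seq.find(T7_PROMOTER)
    if index == -1:
        raise ValueError("no T7 promoter found on this strand")
    start = index + len(T7_PROMOTER) - 1  # position of the initiating G
    return seq[start:].replace("T", "U")
```

For a toy template consisting of CCC, the promoter, and the insert CCAGTT, the predicted transcript is GCCAGUU; the insert is an arbitrary example rather than viral sequence.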

Potential host-vector systems include but are not limited to mammalian cell systems infected 
with recombinant virus (e.g., vaccinia virus, adenovirus, Sindbis virus, Semliki Forest 
virus, etc.); insect cell systems infected with recombinant viruses (e.g., baculovirus); 
microorganisms such as yeast containing yeast vectors; plant cells; or bacteria transformed 
with bacteriophage DNA, plasmid DNA, or cosmid DNA. The expression elements of 
vectors vary in their strengths and specificities. Depending on the host-vector system 
utilized, any one of a number of suitable transcription and translation elements may be 
used. 

The cell into which the recombinant vector comprising the HCV DNA clone has been 
introduced is cultured in an appropriate cell culture medium under conditions that provide 
for expression of HCV RNA or such HCV proteins by the cell. Any of the methods 
previously described for the insertion of DNA fragments into a cloning vector may be used 
to construct expression vectors containing a gene consisting of appropriate 
transcriptional/translational control signals and the protein coding sequences. These 
methods may include in vitro recombinant DNA and synthetic techniques and in vivo 
recombination (genetic recombination). 



Expression of HCV RNA and Polypeptides 



Expression of HCV RNA or protein may be controlled by any promoter/enhancer element 
known in the art, but these regulatory elements must be functional in the host selected for 
expression. Promoters which may be used to control expression include, but are not limited 
to, the SV40 early promoter region (Benoist and Chambon, 1981, Nature 290:304-310), the 
promoter contained in the 3' long terminal repeat of Rous sarcoma virus (Yamamoto et al., 
1980, Cell 22:787-797), the herpes thymidine kinase promoter (Wagner et al., 1981, Proc. 
Natl. Acad. Sci. U.S.A. 78:1441-1445), the regulatory sequences of the metallothionein 
gene (Brinster et al., 1982, Nature 296:39-42); prokaryotic expression vectors such as the 
β-lactamase promoter (Villa-Kamaroff et al., 1978, Proc. Natl. Acad. Sci. U.S.A. 
75:3727-3731), or the tac promoter (DeBoer et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 
80:21-25); see also "Useful proteins from recombinant bacteria" in Scientific American, 
1980, 242:74-94; promoter elements from yeast or other fungi such as the Gal 4 promoter, 
the ADC (alcohol dehydrogenase) promoter, PGK (phosphoglycerol kinase) promoter, 
alkaline phosphatase promoter; and the animal transcriptional control regions, which exhibit 
tissue specificity and have been utilized in transgenic animals: elastase I gene control region 
which is active in pancreatic acinar cells (Swift et al., 1984, Cell 38:639-646; Ornitz et al., 
1986, Cold Spring Harbor Symp. Quant. Biol. 50:399-409; MacDonald, 1987, Hepatology 
7:425-515); insulin gene control region which is active in pancreatic beta cells (Hanahan, 
1985, Nature 315:115-122), immunoglobulin gene control region which is active in 
lymphoid cells (Grosschedl et al., 1984, Cell 38:647-658; Adames et al., 1985, Nature 
318:533-538; Alexander et al., 1987, Mol. Cell. Biol. 7:1436-1444), mouse mammary 
tumor virus control region which is active in testicular, breast, lymphoid and mast cells 
(Leder et al., 1986, Cell 45:485-495), albumin gene control region which is active in liver 
(Pinkert et al., 1987, Genes and Devel. 1:268-276), alpha-fetoprotein gene control region 
which is active in liver (Krumlauf et al., 1985, Mol. Cell. Biol. 5:1639-1648; Hammer et 
al., 1987, Science 235:53-58), alpha 1-antitrypsin gene control region which is active in the 
liver (Kelsey et al., 1987, Genes and Devel. 1:161-171), beta-globin gene control region 
which is active in myeloid cells (Mogram et al., 1985, Nature 315:338-340; Kollias et al., 
1986, Cell 46:89-94), myelin basic protein gene control region which is active in 
oligodendrocyte cells in the brain (Readhead et al., 1987, Cell 48:703-712), myosin light 
chain-2 gene control region which is active in skeletal muscle (Sani, 1985, Nature 314:283- 
286), and gonadotropic releasing hormone gene control region which is active in the 
hypothalamus (Mason et al., 1986, Science 234:1372-1378). 

A wide variety of host/expression vector combinations may be employed in expressing the 
DNA sequences of this invention. Useful expression vectors, for example, may consist of 
segments of chromosomal, non-chromosomal and synthetic DNA sequences. Suitable 
vectors include derivatives of SV40 and known bacterial plasmids, e.g., E. coli plasmids 
col E1, pCR1, pBR322, pMal-C2, pET, pGEX [Smith et al., 1988, Gene 67:31-40], pMB9 
and their derivatives, plasmids such as RP4; phage DNAs, e.g., the numerous derivatives 
of phage λ, e.g., NM989, and other phage DNA, e.g., M13 and filamentous single 
stranded phage DNA; yeast plasmids such as the 2μ plasmid or derivatives thereof; vectors 
useful in eukaryotic cells, such as vectors useful in insect or mammalian cells; vectors 
derived from combinations of plasmids and phage DNAs, such as plasmids that have been 
modified to employ phage DNA or other expression control sequences; and the like known 
in the art. 

In addition to the preferred sequencing analysis, expression vectors containing an HCV 
DNA clone of the invention can be identified by five general approaches: (a) PCR 
amplification of the desired plasmid DNA or specific mRNA, (b) nucleic acid 
hybridization, (c) presence or absence of selection marker gene functions, (d) analysis with 
appropriate restriction endonucleases, and (e) expression of inserted sequences. In the first 
approach, the nucleic acids can be amplified by PCR to provide for detection of the 
amplified product. In the second approach, the presence of a foreign gene inserted in an 
expression vector can be detected by nucleic acid hybridization using probes comprising 
sequences that are homologous to the HCV DNA. In the third approach, the recombinant 
vector/host system can be identified and selected based upon the presence or absence of 
certain "selection marker" gene functions (e.g., β-galactosidase activity, thymidine kinase 
activity, resistance to antibiotics, transformation phenotype, occlusion body formation in 
baculovirus, etc.) caused by the insertion of foreign genes in the vector. In the fourth 
approach, recombinant expression vectors are identified by digestion with appropriate 
restriction enzymes. In the fifth approach, recombinant expression vectors can be identified 
by assaying for the activity, biochemical, or immunological characteristics of the gene 
product expressed by the recombinant, e.g., HCV RNA, HCV virions, or HCV viral 
proteins. 



For example, in a baculovirus expression system, both non-fusion transfer vectors, such as 
but not limited to pVL941 (BamHI cloning site; Summers), pVL1393 (BamHI, SmaI, XbaI, 
EcoRI, NotI, XmaIII, BglII, and PstI cloning site; Invitrogen), pVL1392 (BglII, PstI, NotI, 
XmaIII, EcoRI, XbaI, SmaI, and BamHI cloning site; Summers and Invitrogen), and 
pBlueBacIII (BamHI, BglII, PstI, NcoI, and HindIII cloning site, with blue/white 
recombinant screening possible; Invitrogen), and fusion transfer vectors, such as but not 
limited to pAc700 (BamHI and KpnI cloning site, in which the BamHI recognition site 
begins with the initiation codon; Summers), pAc701 and pAc702 (same as pAc700, with 
different reading frames), pAc360 (BamHI cloning site 36 base pairs downstream of a 
polyhedrin initiation codon; Invitrogen(195)), and pBlueBacHisA, B, C (three different 
reading frames, with BamHI, BglII, PstI, NcoI, and HindIII cloning site, an N-terminal 
peptide for ProBond purification, and blue/white recombinant screening of plaques; 
Invitrogen) can be used. 

Examples of mammalian expression vectors contemplated for use in the invention include 
vectors with inducible promoters, such as the dihydrofolate reductase (DHFR) promoter, 
e.g., any expression vector with a DHFR expression vector, or a DHFR/methotrexate co- 
amplification vector, such as pED (PstI, SalI, SbaI, SmaI, and EcoRI cloning site, with the 
vector expressing both the cloned gene and DHFR; [see Kaufman, Current Protocols in 
Molecular Biology, 16.12 (1991)]). Alternatively, a glutamine synthetase/methionine 
sulfoximine co-amplification vector can be used, such as pEE14 (HindIII, XbaI, SmaI, SbaI, 
EcoRI, and BclI cloning site, in which the vector expresses glutamine synthase and the 
cloned gene; Celltech). In another embodiment, a vector that directs episomal expression 
under control of Epstein Barr Virus (EBV) can be used, such as pREP4 (BamHI, SfiI, XhoI, 
NotI, NheI, HindIII, NheI, PvuII, and KpnI cloning site, constitutive RSV-LTR promoter, 
hygromycin selectable marker; Invitrogen), pCEP4 (BamHI, SfiI, XhoI, NotI, NheI, HindIII, 
NheI, PvuII, and KpnI cloning site, constitutive hCMV immediate early gene, hygromycin 
selectable marker; Invitrogen), pMEP4 (KpnI, PvuI, NheI, HindIII, NotI, XhoI, SfiI, 
BamHI cloning site, inducible metallothionein IIa gene promoter, hygromycin selectable 
marker; Invitrogen), pREP8 (BamHI, XhoI, NotI, HindIII, NheI, and KpnI cloning site, 
RSV-LTR promoter, histidinol selectable marker; Invitrogen), pREP9 (KpnI, NheI, HindIII, 
NotI, XhoI, SfiI, and BamHI cloning site, RSV-LTR promoter, G418 selectable marker; 
Invitrogen), and pEBVHis (RSV-LTR promoter, hygromycin selectable marker, N-terminal 
peptide purifiable via ProBond resin and cleaved by enterokinase; Invitrogen). Regulatable 
mammalian expression vectors can be used, such as Tet and rTet [Gossen and Bujard, 
Proc. Natl. Acad. Sci. USA 89:5547-51 (1992); Gossen et al., Science 268:1766-1769 
(1995)]. Selectable mammalian expression vectors for use in the invention include 
pRc/CMV (HindIII, BstXI, NotI, SbaI, and ApaI cloning site, G418 selection; Invitrogen), 
pRc/RSV (HindIII, SpeI, BstXI, NotI, XbaI cloning site, G418 selection; Invitrogen), and 
others. Vaccinia virus mammalian expression vectors [see, Kaufman (1991) supra] for use 
according to the invention include but are not limited to pSC11 (SmaI cloning site, TK- and 
β-gal selection), pMJ601 (SalI, SmaI, AflI, NarI, BspMII, BamHI, ApaI, NheI, SacII, KpnI, 
and HindIII cloning site; TK- and β-gal selection), and pTKgptF1S (EcoRI, PstI, SalI, AccI, 
HindII, SbaI, BamHI, and Hpa cloning site, TK or XPRT selection). 

Yeast expression systems, such as the non-fusion pYES2 vector (XbaI, SphI, 
ShoI, NotI, GstXI, EcoRI, BstXI, BamHI, SacI, KpnI, and HindIII cloning site; Invitrogen) 
or the fusion pYESHisA, B, C (XbaI, SphI, ShoI, NotI, BstXI, EcoRI, BamHI, SacI, KpnI, 
and HindIII cloning site, N-terminal peptide purified with ProBond resin and cleaved with 
enterokinase; Invitrogen), to mention just two, can be employed according to the invention. 

In addition, a host cell strain may be chosen which modulates the expression of the inserted 
sequences, or modifies and processes the gene product in the specific fashion desired. 
Different host cells have characteristic and specific mechanisms for the translational and 
post-translational processing and modification (e.g., glycosylation, cleavage [e.g., of signal 
sequence]) of proteins. Expression in yeast can produce a glycosylated product. 
Expression in eukaryotic cells can increase the likelihood of "native" glycosylation and 
folding of an HCV protein. Moreover, expression in mammalian cells can provide a tool 
for reconstituting, or constituting, native HCV virions or virus particle proteins. 

Furthermore, different vector/host expression systems may affect processing reactions, such 
as proteolytic cleavages, to a different extent. 

A variety of transfection methods, useful for other RNA virus studies, are enabled herein. 
Examples include microinjection, cell fusion, calcium-phosphate, cationic liposomes such as 
lipofectin [Rice et al., New Biol. 1:285-296 (1989); see "HCV-based Gene Expression 
Vectors", infra], DEAE-dextran [Rice et al., J. Virol. 61: 3809-3819 (1987)], and 
electroporation [Bredenbeek et al., J. Virol. 67: 6439-6446 (1993); Liljestrom et al., J. 
Virol. 65: 4107-4113 (1991)]. Scrape loading [Kumar et al., Biochem. Mol. Biol. Int. 32: 
1059-1066 (1994)] and ballistic methods [Burkholder et al., J. Immunol. Meth. 165: 
149-156 (1993)] may also be considered for cell types refractory to transfection by these 
other methods. A DNA vector transporter may be considered [see, e.g., Wu et al., 1992, 
J. Biol. Chem. 267:963-967; Wu and Wu, 1988, J. Biol. Chem. 263:14621-14624; 
Hartmut et al., Canadian Patent Application No. 2,012,311, filed March 15, 1990]. 

In Vitro Infection With HCV 
Identification of cell lines supporting HCV replication. An important aspect of the invention 
is a method it provides for developing new and more effective anti-HCV therapy by 
conferring the ability to evaluate the efficacy of different therapeutic strategies using an 
authentic and standardized in vitro HCV replication system. Such assays are invaluable 
before moving on to trials using rare and valuable experimental animals, such as the 
chimpanzee, or HCV-infected human patients. As mentioned in the Background of the 
Invention, at best only trace levels of HCV replication have been observed in cell culture 
and most of the systems reported are not amenable for drug screening or evaluation. The 
most promising system reported to date is the HTLV-I-infected MT-2C T-lymphocyte 
subline, which has been shown to support HCV replication with a signal:noise ratio of 
about 1000:1 [Mizutani et al., J. Virol. 70: 7219-23 (1996)]. It should be noted, however, 
that replication in this system is initiated by infection with a patient inoculum. Such a 
system may have utility, but will be limited by differences between inocula which affect cell 
tropism and the detection of replication. 



The HCV infectious clone technology can be used to establish in vitro and in vivo systems 
for analysis of HCV replication and packaging. These include, but are not restricted to, (i> 
identification or selection of pennissive cell types (for RNA replication, virion assembly 
and release); (ii) investigation of cell culture parameters {e.g., varying culture conditions, 
cell activation, etc.) or selection of adaptive mutations that increase the efficiency of HCV 
rq)lication in cell cultures; and (iii) definition of conditions for efficient production of 
infectious HCV particles (either released into the culture supernatant or obtained afiter cell 
disruption). These and other readily apparent extensions of the invention have broad utility 
for HCV therapeutic, vaccine, and diagnostic development.

General approaches for identifying permissive cell types are outlined below. Optimal
methods for RNA transfection (see also, supra) vary with cell type and are determined
using RNA reporter constructs. These include, for example, bicistronic RNAs [Wang et
al., J. Virol. 67: 3338-44 (1993)] with the structure 5'-CAT-HCV IRES-LUC-3', which are
used both to optimize transfection conditions (CAT; chloramphenicol acetyltransferase
activity) and to determine if the cell type is permissive for HCV IRES-mediated translation
(LUC; luciferase activity). For actual HCV RNA transfection experiments, cotransfection
with a 5' capped luciferase reporter RNA [Wang et al., (1993) supra] provides an internal
standard for productive transfection and translation. Examples of cell types potentially
permissive for HCV replication include, but are not restricted to, primary human cells
(e.g., hepatocytes, T-cells, B-cells, foreskin fibroblasts) as well as continuous human cell
lines (e.g., HepG2, Huh7, HUT78, HPB-Ma, MT-2, MT-2C, and other HTLV-I and
HTLV-II infected T-cell lines, Namalwa, Daudi, EBV-transformed LCLs). In addition,
cell lines of other species, especially those which are readily transfected with RNA and
permissive for replication of flaviviruses or pestiviruses (e.g., SW-13, Vero, BHK-21,
COS, PK-15, MDBK, etc.), can be tested. Cells are transfected using a method as
described supra.
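
The bicistronic read-out described above lends itself to a simple normalization. The following Python sketch is illustrative only: it divides the IRES-dependent LUC signal by the cap-dependent CAT signal so that cell types can be compared independently of transfection efficiency. The class and function names, the `min_cat` cutoff, and all numeric readings are hypothetical and are not part of the disclosed methods.

```python
from dataclasses import dataclass


@dataclass
class ReporterReading:
    cell_type: str
    cat_activity: float  # first cistron: reports transfection efficiency
    luc_activity: float  # second cistron: reports HCV IRES-mediated translation


def ires_index(reading, min_cat=1.0):
    """Return LUC/CAT, or None when CAT is too low to trust the transfection."""
    if reading.cat_activity < min_cat:
        return None
    return reading.luc_activity / reading.cat_activity


def rank_cell_types(readings, min_cat=1.0):
    """Rank cell types by LUC/CAT, dropping poorly transfected cultures."""
    scored = [(r.cell_type, ires_index(r, min_cat)) for r in readings]
    usable = [(name, index) for name, index in scored if index is not None]
    return sorted(usable, key=lambda item: item[1], reverse=True)


# Made-up readings in arbitrary units, for illustration only.
readings = [
    ReporterReading("Huh7", cat_activity=200.0, luc_activity=5000.0),
    ReporterReading("HepG2", cat_activity=100.0, luc_activity=500.0),
    ReporterReading("Vero", cat_activity=0.5, luc_activity=10.0),
]
ranking = rank_cell_types(readings)
print(ranking)
```

A culture whose CAT signal falls below the cutoff is excluded rather than scored, since a low LUC/CAT ratio in such a culture would say little about IRES function.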

For replication assays, RNA transcripts are prepared using the functional clone, and the
corresponding non-functional derivative, e.g., ΔGDD (see Examples), is used as a negative
control for persistence of HCV RNA and antigen in the absence of productive replication.
Template DNA (which complicates later analyses) is removed by repeated cycles of DNase I
treatment and acid phenol extraction followed by purification by either gel electrophoresis
or gel filtration (less than one molecule of amplifiable DNA per 10' molecules of transcript 
RNA). DNA-free RNA transcripts are mixed with LUC reporter RNA and used to
transfect cell cultures using the optimal conditions determined above. After recovery of the
cells, RNase A is added to the media to digest excess input RNA and the cultures are incubated
for various periods of time. An early timepoint (~1 day post-transfection) is
harvested and analyzed for LUC activity (to verify productive transfection) and positive-strand
RNA levels in the cells and supernatant (as a baseline). Samples are collected
periodically for 2-3 weeks and assayed for positive-strand RNA levels by QC-RT/PCR [see
Kolykhalov et al., (1996) supra]. Cell types showing a clear and reproducible difference
between the intact infectious transcript and the non-functional derivative, e.g., the ΔGDD
deletion control, can be subjected to more thorough analyses to verify authentic replication.
Such assays include measurement of negative-sense HCV RNA accumulation by QC-RT/PCR
[Gunji et al., (1994) supra; Lanford et al., Virology 202: 606-14 (1994)],
Northern-blot hybridization, or metabolic labeling [Yoo et al., (1995) supra] and single cell
methods, such as in situ hybridization [ISH; Gowans et al., In "Nucleic Acid Probes" (R.
H. Symons, Eds.), Vol. pp. 139-158. CRC Press, Boca Raton (1989)], in situ PCR [followed
by ISH to detect only HCV-specific amplification products; Haase et al., Proc. Natl. Acad.
Sci. USA 87: 4971-4975 (1990)], and immunohistochemistry.
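
The comparison between the intact transcript and the non-functional control can be expressed as a per-timepoint ratio of positive-strand RNA levels. The Python sketch below is a hypothetical illustration of that comparison; the function names, the ten-fold threshold, the requirement for two supporting timepoints, and the copy numbers are assumptions made for the example and are not values taken from this disclosure.

```python
def fold_difference(intact, control):
    """Ratio of positive-strand RNA copies (intact / control) at shared days."""
    days = sorted(set(intact) & set(control))
    return {day: intact[day] / control[day] for day in days if control[day] > 0}


def candidate_for_followup(intact, control, baseline_day=1,
                           min_fold=10.0, min_points=2):
    """Flag a cell type when the intact transcript exceeds the control by at
    least min_fold at min_points or more timepoints after the baseline."""
    folds = fold_difference(intact, control)
    late = [fold for day, fold in folds.items() if day > baseline_day]
    return sum(fold >= min_fold for fold in late) >= min_points


# Made-up copy numbers keyed by day post-transfection, for illustration only.
intact = {1: 1e6, 7: 5e5, 14: 8e5, 21: 9e5}
delta_gdd = {1: 1e6, 7: 1e4, 14: 1e3, 21: 1e2}

print(fold_difference(intact, delta_gdd))
print(candidate_for_followup(intact, delta_gdd))
```

A flag from such a comparison only nominates a cell type for the more thorough analyses; it is not itself evidence of authentic replication.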

HCV particles for studying virus-receptor interactions. In combination with the 
identification of cell lines which are permissive for HCV infection and replication, defined 
HCV stocks produced using the infectious clone technology can be used to evaluate the 
interaction of the HCV with cellular receptors. Assays can be set up which measure 
binding of the virus to susceptible cells or productive infection, and then used to screen for 
inhibitors of these processes. 

Identification of cell lines for characterization of HCV receptors. Cell lines permissive for 
HCV RNA replication, as assayed by RNA transfection, can be screened for their ability to 
be infected by the virus. Cell lines permissive for RNA replication but which cannot be 
infected by the homologous virus may lack one or more host receptors required for HCV
binding and entry. Such cells provide valuable tools for (i) functional identification and
molecular cloning of HCV receptors and co-receptors; (ii) characterization of virus-receptor
interactions; and (iii) developing assays to screen for compounds or biologics (e.g.,
antibodies, SELEX RNAs [Bartel and Szostak, In "RNA-protein interactions" (K. Nagai
and I. W. Mattaj, Eds.), Vol. pp. 82-102. IRL Press, Oxford (1995); Gold et al., Annu. Rev.
Biochem. 64: 763-797 (1995)], etc.) that inhibit these interactions.

Once defined in this manner, these HCV receptors serve not only as therapeutic targets but
may also be expressed in transgenic animals rendering them susceptible to HCV infection
[Koike et al., Dev Biol Stand 78: 101-7 (1993); Ren and Racaniello, J Virol 66: 296-304
(1992)]. Such transgenic animal models supporting HCV replication and spread have
important applications for evaluating anti-HCV drugs.

The ability to manipulate the HCV glycoprotein structure using infectious clone technology,
or by genetic manipulations as described supra, may also be used to create HCV variants 
with altered receptor specificity. In one example, HCV glycoproteins can be modified to 
express a heterologous binding domain for a known cell surface receptor. The approach 
should allow the engineering of HCV derivatives with altered tropism and perhaps extend 
infection to non-chimeric small animal models. 

Alternative approaches for identifying permissive cell lines. Besides using the unmodified 
HCV RNA transcripts derived from functional clones, these functional HCV clones can be 
engineered to provide selectable markers for HCV replication. For instance, genes 
encoding dominant selectable markers can be expressed as part of the HCV polyprotein, or 
as separate cistrons located in permissive regions of the HCV RNA genome. Such 
engineered derivatives [see Bredenbeek and Rice, Semin. Virol. 3: 297-310 (1992) for
review] have been successfully constructed for other RNA viruses such as Sindbis virus
[Frolov et al., Proc. Natl. Acad. Sci. U.S.A. 93: 11371-11377 (1996)] or the flavivirus
Kunjin [Khromykh and Westaway, J. Virol. 71: 1497-1505 (1997)]. Examples of
selectable markers for mammalian cells include, but are not limited to, the genes encoding
dihydrofolate reductase (DHFR; methotrexate resistance), thymidine kinase (tk;
methotrexate resistance), puromycin acetyl transferase (pac; puromycin resistance),
neomycin resistance (neo; resistance to neomycin or G418), mycophenolic acid resistance
(gpt), hygromycin resistance, and resistance to zeocin. Other selectable markers can be
used in different hosts such as yeast (ura3, his3, leu2, trp1). Strategies for functional
expression of heterologous genes have been described [see Bredenbeek and Rice, (1992)
supra for review]. Examples include (Figure 2): (i) in-frame insertion into the viral
polyprotein with cleavage(s) to produce the selectable marker protein mediated by cellular
or viral proteases; and (ii) creation of separate cistrons using engineered translational start and
stop signals. Examples include, but are not restricted to, the use of internal ribosome entry
site (IRES) RNA elements derived from cellular or viral mRNAs [Jang et al., Enzyme 44:
292-309 (1991); Macejak and Sarnow, Nature 353: 90-94 (1991); Molla et al., Nature
356: 255-257 (1992)]. In a particular manifestation, a cassette including the EMCV IRES
element and the neomycin resistance gene is inserted in the HCV H77 3' NTR
hypervariable region. Transcribed RNAs are used to transfect human hepatocyte or other
cell lines and the antibiotic G418 is used for selecting resistant cell populations. In one
manifestation of this approach, transcripts from pHCVFL/3'EMCVIRESneo (infra) are
used to transfect a variety of different cell lines.

Alterations of the HCV cDNA can be made to produce lines expressing convenient 
assayable markers as indirect indicators of HCV replication. Such self-replicating RNAs 
might include the entire HCV genome RNA or RNA replicons, where regions non-essential 
for RNA replication have been deleted. Assayable genes might include a second dominant 
selectable marker, or those encoding proteins with convenient assays. Examples include, 
but are not restricted to, β-galactosidase, β-glucuronidase, firefly or bacterial luciferase,
green fluorescent protein (GFP) and humanized derivatives thereof, cell surface markers, 
and secreted markers. Such products are either assayed directly or may activate the 
expression or activity of additional reporters. 

Animal Models for HCV Infection and Replication 
In addition to chimpanzees, the present invention permits development of alternative animal 
models for studying HCV replication and evaluating novel therapeutics. Using the 
authentic HCV cDNA clones described in this invention as starting material, multiple 
approaches can be envisioned for establishing alternative animal models for HCV
replication. In one manifestation, well-defined HCV stocks, produced by transfection of 
chimpanzees or by replication in cell culture, could be used to inoculate immunodeficient 
mice harboring human tissues capable of supporting HCV replication. An example of this 
art is the SCID:Hu mouse, where mice with a severe combined immunodeficiency are 
engrafted with various human (or chimpanzee) tissues, which could include, but are not 
limited to, fetal liver, adult liver, spleen, or peripheral blood mononuclear cells. Besides 
SCID mice, normal irradiated mice can serve as recipients for engraftment of human or 
chimpanzee tissues. These chimeric animals would then be substrates for HCV replication 
after either ex vivo or in vivo infection with defined virus-containing inocula. 

In another manifestation, adaptive mutations allowing HCV replication in alternative species 
may produce variants which will be permissive for replication in these animals. For 
instance, adaptation of HCV for replication and spread in either continuous rodent cell lines or
primary tissues (such as hepatocytes) enables the virus to replicate in small rodent
models. Alternatively, complex libraries of HCV variants generated by chemical or biological
[Stemmer, Proc. Natl. Acad. Sci. USA 91:10747 (1994)] methods can be created and used
for inoculation of potentially susceptible animals. Such animals could be either
immunocompetent or immunodeficient, as described above. Variants capable of replication
can be isolated, molecularly cloned and then the adaptive mutations incorporated into a full-length
clone, which is functional for replication in the selected non-human species.

The functional activity of HCV can be evaluated transgenically. In this respect, a
transgenic mouse model can be used [see, e.g., Wilmut et al., Experientia 47:905 (1991)].
The HCV RNA or DNA clone can be used to prepare transgenic vectors, including viral
vectors, or cosmid clones (or phage clones). Cosmids may be introduced into transgenic
mice using published procedures [Jaenisch, Science, 240: 1468-1474 (1988)]. In the
preparation of transgenic mice, embryonic stem cells are obtained from blastocyst embryos
[Joyner, In Gene Targeting: A Practical Approach, The Practical Approach Series,
Rickwood, D., and Hames, B. D., Eds., IRL Press: Oxford (1993)] and transfected with
HCV DNA or RNA. Transfected cells are injected into early embryos, e.g., mouse
embryos, as described [Hammer et al., Nature 315:680 (1985); Joyner, supra]. Various
techniques for preparation of transgenic animals have been described [U.S. Patent No.
5,530,177, issued June 25, 1996; U.S. Patent No. 5,898,604, issued December 31, 1996].
Of particular interest are transgenic animal models in which the phenotypic or pathogenic
effects of a transgene are studied. For example, the effects of a rat phosphoenolpyruvate
carboxykinase-bovine growth hormone fusion gene have been studied in pigs [Wieghart et
al., J. Reprod. Fert., Suppl. 41:89-96 (1996)]. Transgenic mice that express a gene
encoding a human amyloid precursor protein associated with Alzheimer's disease are used
to study this disease and other disorders [International Patent Publication WO 96/06927,
published March 7, 1996; Quon et al., Nature 352:239 (1991)]. Transgenic mice have also
been created for the hepatitis delta agent [Polo et al., J. Virol. 69:5203 (1995)] and for
hepatitis B virus [Chisari, Curr. Top. Microbiol. Immunol. 206:149 (1996)], and replication
occurs in these engineered animals.

Thus, the functional cDNA clones described here, or parts thereof, can be used to create
transgenic models relevant to HCV replication and pathogenesis. In one example,
transgenic animals harboring the entire HCV genome can be created. Appropriate
constructs for transgenic expression of the entire HCV genome in a transgenic mouse of the
invention could include a nuclear promoter engineered to produce transcripts with the
appropriate 5' terminus, the full-length HCV cDNA sequence, a cis-cleaving delta
ribozyme [Ball, J. Virol. 66: 2335-2345 (1992); Pattnaik et al., Cell 69: 1011-1020
(1992)] to produce an authentic 3' terminus, followed possibly by signals that promote
proper nuclear processing and transport to the cytoplasm (where HCV RNA replication
occurs). Besides the entire HCV genome, animals can be engineered to express
individual or various combinations of HCV proteins and RNA elements. For example, 
animals engineered to express an HCV gene product or reporter gene under the control of 
the HCV IRES can be used to evaluate therapies directed against this specific RNA target. 
Similar animal models can be envisioned for most known HCV targets. 

Such alternative animal models are useful for (i) studying the effects of different antiviral 
agents on HCV replication in a whole animal system; (ii) examining potential direct 
cytotoxic effects of HCV gene products on hepatocytes and other cell types, defining the 
underlying mechanisms involved, and identifying and testing strategies for therapeutic 
intervention; and (iii) studying immune-mediated mechanisms of cell and tissue damage 
relevant to HCV pathogenesis and identifying and testing strategies for interfering with 
these processes. 



Selection and Analysis of Drug-Resistant Variants
Cell lines and animal models supporting HCV replication can be used to examine the
emergence of HCV variants with resistance to existing and novel therapeutics. As with other
RNA viruses, the HCV replicase is presumed to lack proofreading activity and RNA
replication is therefore error prone, giving rise to a high level of variation [Bukh et al.,
(1995) supra]. The variability manifests itself in the infected patient over time and in the
considerable diversity observed between different isolates. The emergence of drug-resistant
variants is likely to be an important consideration in the design and evaluation of HCV
mono- and combination therapies. HCV replication systems of the invention can be used to
study the emergence of variants under various therapeutic formulations. These might
include monotherapy or various combination therapies (e.g., IFN-α, ribavirin, and new
antiviral compounds). Resistant mutants can then be used to define the molecular and
structural basis of resistance and to evaluate new therapeutic formulations, or in screening
assays for effective anti-HCV drugs (infra).
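
The link between an error-prone replicase and the emergence of resistant variants can be illustrated numerically. The Python sketch below assumes independent misincorporation at each position and equal likelihood of the three possible wrong bases; the per-nucleotide error rate and the population size are hypothetical values chosen for illustration, and the genome length of roughly 9,600 nucleotides is an approximation. None of these figures is a measurement reported in this disclosure.

```python
def prob_at_least_one_error(error_rate, genome_length):
    """Probability that one copied genome carries at least one misincorporation,
    assuming independent errors at each position."""
    return 1.0 - (1.0 - error_rate) ** genome_length


def expected_carriers(error_rate, population_size):
    """Expected number of genomes carrying one specific substitution at one
    specific site, assuming the three wrong bases are equally likely."""
    return population_size * error_rate / 3.0


GENOME_LENGTH = 9600    # approximate HCV genome length in nucleotides
ERROR_RATE = 1e-4       # hypothetical per-nucleotide error rate
POPULATION = 1_000_000  # hypothetical number of genomes copied

p_any = prob_at_least_one_error(ERROR_RATE, GENOME_LENGTH)
carriers = expected_carriers(ERROR_RATE, POPULATION)
print(f"P(at least one error per genome copy) = {p_any:.2f}")
print(f"Expected carriers of a given substitution = {carriers:.1f}")
```

Under assumptions of this kind, single-substitution variants would be expected to pre-exist in a replicating population, which is one reason combination therapies are of interest alongside monotherapy.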



Screening for Anti-HCV Agents
HCV-permissive cell lines or animal models (preferably rodent models) can be used to
screen for novel inhibitors or to evaluate candidate anti-HCV therapies. Such therapies
include, but would not be limited to, (i) antisense oligonucleotides or ribozymes targeted to
conserved HCV RNA targets; (ii) injectable compounds capable of inhibiting HCV
replication; and (iii) orally bioavailable compounds capable of inhibiting HCV replication.
Targets for such formulations include, but are not restricted to, (i) conserved HCV RNA
elements important for RNA replication and RNA packaging; (ii) HCV-encoded enzymes; 
(iii) protein-protein and protein-RNA interactions important for HCV RNA replication, 
virus assembly, virus release, viral receptor binding, viral entry, and initiation of viral 
RNA replication; (iv) virus-host interactions modulating the ability of HCV to establish 
chronic infections; (v) virus-host interactions modulating the severity of liver damage, 
including factors affecting apoptosis and hepatotoxicity; (vi) virus-host interactions leading
to the development of more severe clinical outcomes including cirrhosis and hepatocellular 
carcinoma; and (vii) virus-host interactions resulting in other, less frequent, HCV- 
associated human diseases. 

Evaluation of antisense and ribozyme therapies. The present invention extends to the 
preparation of antisense nucleotides and ribozymes that may be tested for the ability to 
interfere with HCV replication. This approach utilizes antisense nucleic acid and 
ribozymes to block translation of a specific mRNA, either by masking that mRNA with an 
antisense nucleic acid or cleaving it with a ribozyme. 

Antisense nucleic acids are DNA or RNA molecules that are complementary to at least a 
portion of a specific mRNA molecule [see Marcus-Sekura, Anal. Biochem. 172:298
(1988)]. In the cell, they hybridize to that mRNA, forming a double-stranded DNA:RNA
or RNA:RNA molecule. The cell does not translate an mRNA in this double-stranded
form. Therefore, antisense nucleic acids interfere with the expression of mRNA into
protein. Oligomers of about fifteen nucleotides and molecules that hybridize to the AUG
initiation codon will be particularly efficient, since they are easy to synthesize and are likely
to pose fewer problems than larger molecules when introducing them into organ cells.
Antisense methods have been used to inhibit the expression of many genes in vitro
[Marcus-Sekura, 1988, supra; Hambor et al., J. Exp. Med. 168:1237 (1988)]. Preferably,
synthetic antisense nucleotides contain phosphoester analogs, such as phosphorothioates or
thioesters, rather than natural phosphoester bonds. Such phosphoester bond analogs are
more resistant to degradation, increasing the stability, and therefore the efficacy, of the
antisense nucleic acids.
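
As an illustration of the preference for oligomers of about fifteen nucleotides that hybridize to the AUG initiation codon, the Python sketch below derives an antisense DNA oligomer as the reverse complement of a fifteen-nucleotide window centered on an AUG. The function names and the input sequence are hypothetical; the sequence is a toy example and not an HCV sequence, and the sketch simply takes the first AUG it finds, which in a real mRNA is not necessarily the initiation codon.

```python
# DNA bases that pair with each RNA (or DNA) base of the target.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def antisense_dna(target_rna):
    """Reverse complement of the target, written 5'->3' as a DNA oligomer."""
    return "".join(COMPLEMENT[base] for base in reversed(target_rna.upper()))


def oligo_spanning_aug(mrna, length=15):
    """Return (target window, antisense oligomer) for a window of the given
    length placed around the first AUG and clamped to the sequence ends."""
    rna = mrna.upper().replace("T", "U")
    if len(rna) < length:
        raise ValueError("sequence shorter than requested oligomer")
    start = rna.find("AUG")
    if start < 0:
        raise ValueError("no AUG found")
    left = max(0, min(start - (length - 3) // 2, len(rna) - length))
    window = rna[left:left + length]
    return window, antisense_dna(window)


toy_mrna = "GCGACACUCCACCAUGAAUCACUCCCC"  # made-up sequence
window, oligo = oligo_spanning_aug(toy_mrna)
print(window)
print(oligo)
```

The sketch addresses base pairing only; backbone chemistry such as the phosphoester analogs mentioned above is a separate synthesis choice.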



In the genetic antisense approach, expression of the wild-type allele is suppressed because 
of expression of antisense RNA. This technique has been used to inhibit TK synthesis in 
tissue culture and to produce phenotypes of the Kruppel mutation in Drosophila, and the 
Shiverer mutation in mice [Izant et al., Cell, 36:1007-1015 (1984); Green et al., Annu.
Rev. Biochem., 55:569-597 (1986); Katsuki et al., Science, 241:593-595 (1988)]. An
important advantage of this approach is that only a small portion of the gene need be 
expressed for effective inhibition of expression of the entire cognate mRNA. The antisense 
transgene will be placed under control of its own promoter or another promoter expressed 
in the correct cell type, and placed upstream of the SV40 polyA site. 

Ribozymes are RNA molecules possessing the ability to specifically cleave other single 
stranded RNA molecules in a manner somewhat analogous to DNA restriction 
endonucleases. Ribozymes were discovered from the observation that certain mRNAs have 
the ability to excise their own introns. By modifying the nucleotide sequence of these 
RNAs, researchers have been able to engineer molecules that recognize specific nucleotide 
sequences in an RNA molecule and cleave it [Cech, J. Am. Med. Assoc. 260:3030 (1988)]. 
Because they are sequence-specific, only mRNAs with particular sequences are inactivated. 



Investigators have identified two types of ribozymes, Tetrahymena-type and
"hammerhead"-type. Tetrahymena-type ribozymes recognize four-base sequences, while
"hammerhead"-type recognize eleven- to eighteen-base sequences. The longer the
recognition sequence, the more likely it is to occur exclusively in the target mRNA species.
Therefore, hammerhead-type ribozymes are preferable to Tetrahymena-type ribozymes for
inactivating a specific mRNA species, and eighteen base recognition sequences are 
preferable to shorter recognition sequences. 
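
The relationship between recognition-sequence length and specificity can be made concrete with a back-of-the-envelope estimate. The Python sketch below computes the expected number of chance matches to a recognition sequence of a given length, under the simplifying assumption that the RNA pool is a random sequence with equal base frequencies. The function name and the pool size of 10^8 nucleotides are hypothetical and serve only to show the trend.

```python
def expected_chance_matches(site_length, pool_nt):
    """Expected number of exact matches to one specific site of site_length
    bases in a random RNA pool of pool_nt nucleotides (equal base frequencies)."""
    positions = max(pool_nt - site_length + 1, 0)
    return positions / 4 ** site_length


POOL_NT = 10 ** 8  # hypothetical total length of non-target RNA in a cell

for site_length in (4, 11, 18):
    matches = expected_chance_matches(site_length, POOL_NT)
    print(f"{site_length}-base site: {matches:.3g} expected chance matches")
```

Each added base reduces the expected number of chance matches about four-fold, which is consistent with the stated preference for the longest recognition sequences.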

Screening compound libraries for anti-HCV activity. Various natural product or synthetic 
libraries can be screened for anti-HCV activity in the in vitro or in vivo models provided by 
the invention. One approach to preparation of a combinatorial library uses primarily 
chemical methods, of which the Geysen method [Geysen et al., Molecular Immunology
23:709-715 (1986); Geysen et al., J. Immunologic Method 102:259-274 (1987)] and the
method of Fodor et al. [Science 251:767-773 (1991)] are examples. Furka et al. [14th
International Congress of Biochemistry, Volume 5, Abstract FR:013 (1988); Furka, Int. J.
Peptide Protein Res. 37:487-493 (1991)], Houghton [U.S. Patent No. 4,631,211, issued
December 1986] and Rutter et al. [U.S. Patent No. 5,010,175, issued April 23, 1991]
describe methods to produce a mixture of peptides that can be tested for anti-HCV activity. 

In another aspect, synthetic libraries [Needels et al., Proc. Natl. Acad. Sci. USA 90:10700-4
(1993); Ohlmeyer et al., Proc. Natl. Acad. Sci. USA 90:10922-10926 (1993); Lam et al.,
International Patent Publication No. WO 92/00252; Kocis et al., International Patent
Publication No. WO 94/28028], and the like can be used to screen for anti-HCV compounds
according to the present invention. These references describe adaptation of library
screening techniques to biological assays.

Defined/engineered HCV virus particles for neutralization assays. The functional clones 
described herein can be used to produce defined stocks of HCV-H particles for infectivity 
and neutralization assays. Homogeneous stocks can be produced in the chimpanzee model,
in cell culture systems, or using various heterologous expression systems (e.g., baculovirus,
yeast, mammalian cells; see supra). As described above, besides homogeneous virus
preparations of HCV-H, stocks of other genotypes or isolates can be produced. These
stocks can be used in cell culture or in vivo assays to define molecules or gene therapy
approaches capable of neutralizing HCV particle production or infectivity. Examples of
such molecules include, but are not restricted to, polyclonal antibodies, monoclonal
antibodies, artificial antibodies with engineered/optimized specificity, single-chain
antibodies (see the section on antibodies, infra), nucleic acids or derivatized nucleic acids
selected for specific binding and neutralization, small orally bioavailable compounds, etc.
Such neutralizing agents, targeted to conserved viral or cellular targets, can be either
genotype- or isolate-specific or broadly cross-reactive. They could be used either
prophylactically or for passive immunotherapy to reduce viral load and perhaps increase the
chances of more effective treatment in combination with other antiviral agents (e.g., IFN-α,
ribavirin, etc.). Directed manipulation of HCV infectious clones can also be used to
produce HCV stocks with defined changes in the glycoprotein hypervariable regions or in
other epitopes to study mechanisms of antibody neutralization, CTL recognition, immune
escape and immune enhancement. These studies will lead to identification of other virus-
specific functions for anti-viral therapy. 

Dissection of HCV Replication 
Other HCV replication assays. For the first time, this invention allows directed molecular
genetic dissection of HCV replication. Such analyses are expected to (i) validate antiviral 
targets which are currently being pursued; and (ii) uncover unexpected new aspects of HCV 
replication amenable to therapeutic intervention. Targets for immediate validation through 
mutagenesis studies include the following: the 5' NTR, the HCV polyprotein and cleavage 
products, and the 3' NTR. As described above, analyses using the infectious clone 
technology and permissive cell cultures can be used to compare parental and mutant 
replication phenotypes after transfection of cell cultures with infectious RNA. Even though
RT-PCR allows sensitive detection of viral RNA accumulation, mutations which decrease
the efficiency of RNA replication may be difficult to analyze, unless conditional mutations
are recovered. As a complement to first cycle analyses, trans-complementation assays can
be used to facilitate analysis of HCV mutant phenotypes and inhibitor screening.
Heterologous systems (vaccinia, Sindbis, or non-viral) can be used to drive expression of
the HCV RNA replicase proteins and/or packaging machinery [see Lemm and Rice, J.
Virol. 67: 1905-1915 (1993a); Lemm and Rice, J. Virol. 67: 1916-1926 (1993b); Lemm
et al., EMBO J. 13: 2925-2934 (1994); Li et al., J. Virol. 65: 6714-6723 (1991)]. If these
elements are capable of functioning in trans, then co-expression of RNAs with appropriate
cis-elements should result in RNA replication/packaging. Such systems therefore mimic
steps in authentic RNA replication and virion assembly, but uncouple production of viral
components from HCV replication. If HCV replication is somehow self-limiting,
heterologous systems may drive significantly higher levels of RNA replication or particle 
production, facilitating analysis of mutant phenotypes and antiviral screening. A third 
approach is to devise cell-free systems for HCV template-dependent RNA replication. A 
coupled translation/replication and assembly system has been described for poliovirus in 
HeLa cells [Barton and Flanegan, J. Virol. 67: 822-831 (1993); Molla et al., Science 254:
1647-1651 (1991)], and a template-dependent in vitro assay for initiation of negative-strand 
synthesis has been established for Sindbis virus. Similar in vitro systems for HCV are 
invaluable for studying many aspects of HCV replication as well as for inhibitor screening 
and evaluation. An example of each of these strategies follows.

Trans-complementation of HCV RNA replication and/or packaging using viral or non-viral
expression systems. Heterologous systems can be used to drive HCV replication. For
example, the vaccinia/T7 cytoplasmic expression system has been extremely useful for
trans-complementation of RNA virus replicase and packaging functions [see Ball, (1992)
supra; Lemm and Rice, (1993a) supra; Lemm and Rice, (1993b) supra; Lemm et al.,
(1994) supra; Pattnaik et al., (1992) supra; Pattnaik et al., Virology 206: 760-4 (1995);
Porter et al., J. Virol. 69: 1548-1555 (1995)]. In brief, a vaccinia recombinant (vTF7-3) is
used to express T7 RNA polymerase (T7RNApol) in the cell type of interest. Target
cDNAs, positioned downstream from the T7 promoter, are delivered either as vaccinia
recombinants or by plasmid transfection. This system leads to high level RNA and protein
expression. A variation of this approach, which obviates the need for vaccinia (which
could interfere with HCV RNA replication or virion formation), is the pT7T7 system where
the T7 promoter drives expression of T7RNApol [Chen et al., Nucleic Acids Res. 22:
2114-2120 (1994)]. pT7T7 is mixed with T7RNApol (the protein) and co-transfected with
the T7-driven target plasmid of interest. Added T7RNApol initiates transcription, leading
to its own production and high level expression of the target gene. Using either approach,
RNA transcripts with precise 5' and 3' termini can be produced using the T7 transcription
start site (5') and the cis-cleaving HDV ribozyme (Rz) (3') [Ball, (1992) supra; Pattnaik et
al., (1992) supra].

These or similar expression systems can be used to establish assays for HCV RNA 
replication and particle formation, and for evaluation of compounds which might inhibit 
these processes. In another extension of the HCV functional clone technology, T7-driven 
protein expression constructs and full-length HCV clones incorporating the HDV ribozyme
following the 3' NTR are used. A typical experimental plan to validate the assay is 
described for pT7T7, although essentially similar assays can be envisioned using vTF7-3 or 
cell lines expressing the T7 RNA polymerase. HCV-permissive cells are co-transfected 
with pT7T7 + T7RNApol + p90/HCVFL long pU/Rz (or a negative control, such as ΔGDD).
At different times post-transfection, accumulation of HCV proteins and RNAs, driven by
the pT7T7 system, is followed by Western and Northern blotting, respectively. To assay
for HCV-specific replicase function, actinomycin D (Act. D) is added to block DNA-dependent T7
transcription [Lemm and Rice, (1993a) supra] and Act. D-resistant RNA synthesis is
monitored by metabolic labeling. Radioactivity will be incorporated into full-length HCV
RNAs for p90/HCVFL long pU/Rz, but not for p90/HCVFLΔGDD/Rz. This assay system,
or elaborated derivatives, can be used to screen for inhibitors and to study their effects on 
HCV RNA replication. 
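
When an assay of this kind is used to screen for inhibitors, the read-out for each compound can be summarized as a percent inhibition relative to the untreated signal, after subtracting the background obtained with the non-functional control. The Python sketch below shows one conventional way to express this; the function name, the choice of the control signal as background, and the example counts are assumptions made for illustration and do not describe a validated protocol.

```python
def percent_inhibition(treated, untreated, background):
    """Percent reduction of replicase-specific signal caused by a compound.

    treated    -- Act. D-resistant incorporation with compound (functional clone)
    untreated  -- Act. D-resistant incorporation without compound (functional clone)
    background -- incorporation measured with the non-functional control
    """
    window = untreated - background
    if window <= 0:
        raise ValueError("no replicase-specific signal above background")
    return 100.0 * (1.0 - (treated - background) / window)


# Made-up counts in arbitrary units, for illustration only.
print(percent_inhibition(treated=550.0, untreated=1000.0, background=100.0))
```

Requiring a positive signal window guards against reporting inhibition values for experiments in which the functional clone did not rise above the control.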

Cell-free systems for assaying HCV replication and inhibitors thereof. Cell-free assays for
studying HCV RNA replication and inhibitor screening can also be established using the
functional cDNA clones described in this invention. Either virion or transcribed RNAs are
used as substrate RNA. For HCV, full-length HCV RNAs transcribed in vitro can be used
to program such in vitro systems and replication assayed essentially as described for
poliovirus [see Barton et al., (1995) supra]. In case hepatocyte-specific or other factors
are required for HCV RNA replication, the system can be supplemented with hepatocyte or 
other cell extracts, or alternatively, a comparable system can be established using cell lines 
which have been shown to be permissive for HCV replication. 

One concern about this approach is that proper cell-free synthesis and processing of the 
HCV polyprotein must occur. Sufficient quantities of properly processed replicase 
components may be difficult to produce. To circumvent this problem, the T7 expression 
system can be used to express high levels of HCV replicase components in appropriate cells 
[see Lemm et al., (1997) supra]. P15 membrane fractions from these cells (with added
buffer, Mg²⁺, an ATP regenerating system, and NTPs) should be able to initiate and
synthesize full-length negative-strand RNAs upon addition of HCV-specific template RNAs.

Establishment of either or both of these assays allows rapid and precise analysis of the
effects of HCV mutations, host factors involved in replication, and inhibitors of the various
steps in HCV RNA replication. These systems will also establish the requirements for
helper systems for preparing replication-deficient HCV vectors. 

Vaccination and Protective Immunity 
There are still many unknown parameters that impact on development of effective HCV 
vaccines. It is clear in both man and the chimpanzee that some individuals can clear the 
infection. Also, 10-20% of those treated with IFN appear to show a sustained response as 
evidenced by lack of circulating HCV RNA. Other studies have shown a lack of protective 
immunity, as evidenced by successful reinfection with homologous virus as well as
with more distantly related HCV types [Farci et al., (1992) supra; Prince et al., (1992)
supra]. Nonetheless, chimpanzees immunized with subunit vaccines consisting of E1E2
oligomers and vaccinia recombinants expressing these proteins are partially protected
against low dose challenges [Choo et al., Proc. Natl. Acad. Sci. USA 91:1294 (1994)]. The
infectious clone technology described in this invention has utility not only for basic studies
aimed at understanding the nature of protective immune responses against HCV, but also
for novel vaccine production methods. 

Active immunity against HCV can be induced by immunization (vaccination) with an 
immunogenic amount of an attenuated or inactivated HCV virion, or HCV virus particle 
proteins, preferably with an immunologically effective adjuvant. An "immunologically
effective adjuvant" is a material that enhances the immune response. 

Selection of an adjuvant depends on the subject to be vaccinated. Preferably, a 
pharmaceutically acceptable adjuvant is used. For example, a vaccine for a human should 
avoid oil or hydrocarbon emulsion adjuvants, including complete and incomplete Freund's 
adjuvant. One example of an adjuvant suitable for use with humans is alum (alumina gel). 
A vaccine for an animal, however, may contain adjuvants not appropriate for use with 
humans. 

An alternative to a traditional vaccine comprising an antigen and an adjuvant involves the 
direct in vivo introduction of DNA or RNA encoding the antigen into tissues of a subject for 
expression of the antigen by the cells of the subject's tissue. Such vaccines are termed
herein "DNA vaccines," "genetic vaccination," or "nucleic acid-based vaccines." Methods 
of transfection as described above, such as DNA vectors or vector transporters, can be used 
for DNA vaccines. 

DNA vaccines are described in International Patent Publication WO 95/20660 and
International Patent Publication WO 93/19183, the disclosures of which are hereby
incorporated by reference in their entireties. The ability of directly injected DNA that
encodes a viral protein or genome to elicit a protective immune response has been
demonstrated in numerous experimental systems [Conry et al., Cancer Res., 54:1164-1168
(1994); Cox et al., Virol., 67:5664-5667 (1993); Davis et al., Hum. Mole. Genet.,
2:1847-1851 (1993); Sedegah et al., Proc. Natl. Acad. Sci., 91:9866-9870 (1994);
Montgomery et al., DNA Cell Biol., 12:777-783 (1993); Ulmer et al., Science,
259:1745-1749 (1993); Wang et al., Proc. Natl. Acad. Sci., 90:4156-4160 (1993);
Xiang et al., Virology, 199:132-140 (1994)]. Studies to assess this strategy in
neutralization of influenza virus have used both envelope and internal viral proteins to
induce the production of antibodies, but in particular have focused on the viral
hemagglutinin protein (HA) [Fynan et al., DNA Cell Biol., 12:785-789 (1993A); Fynan
et al., Proc. Natl. Acad. Sci., 90:11478-11482 (1993B); Robinson et al., Vaccine, 11:957
(1993); Webster et al., Vaccine, 12:1495-1498 (1994)].

Vaccination through directly injecting DNA or RNA that encodes a protein to elicit a 
protective immune response produces both cell-mediated and humoral responses. This is 
analogous to results obtained with live viruses [Raz et al., Proc. Natl. Acad. Sci.,
91:9519-9523 (1994); Ulmer, 1993, supra; Wang, 1993, supra; Xiang, 1994, supra]. Studies with
ferrets indicate that DNA vaccines against conserved internal viral proteins of influenza,
together with surface glycoproteins, are more effective against antigenic variants of
influenza virus than are either inactivated or subvirion vaccines [Donnelly et al.,
Nat. Medicine, 6:583-587 (1995)]. Indeed, reproducible immune responses to DNA
encoding nucleoprotein have been reported in mice that last essentially for the lifetime of
the animal [Yankauckas et al., DNA Cell Biol., 12:771-776 (1993)].

A vaccine of the invention can be administered via any parenteral route, including but not
limited to intramuscular, intraperitoneal, intravenous, intraarterial (e.g., hepatic artery) and
the like. Preferably, since the desired result of vaccination is to elicit an immune
response to HCV, administration is directly, or by targeting or choice of a viral vector,
indirectly, to lymphoid tissues, e.g., lymph nodes or spleen. Because immune cells are
continually replicating and retroviruses require replicating cells, immune cells are an
ideal target for retroviral vector-based nucleic acid vaccines.

Passive immunity can be conferred to an animal subject suspected of suffering an infection 
with HCV by administering antiserum, neutralizing polyclonal antibodies, or a neutralizing 
monoclonal antibody against HCV to the patient. Although passive immunity does not
confer long term protection, it can be a valuable tool for the treatment of an acute infection
of a subject who has not been vaccinated. Preferably, the antibodies administered for
passive immune therapy are autologous antibodies. For example, if the subject is a human,
preferably the antibodies are of human origin or have been "humanized," in order to
minimize the possibility of an immune response against the antibodies. In addition, genes
encoding neutralizing antibodies can be introduced in vectors for expression in vivo, e.g.,
in hepatocytes. 

Antibodies for passive immune therapy. Preferably, HCV virions or virus particle proteins 
prepared as described above are used as an immunogen to generate antibodies that 
recognize HCV. Such antibodies include but are not limited to polyclonal, monoclonal, 
chimeric, single chain, Fab fragments, and an Fab expression library. Various procedures
known in the art may be used for the production of polyclonal antibodies to HCV. For the
production of antibody, various host animals can be immunized by injection with the HCV
virions or polypeptide, e.g., as described infra, including but not limited to rabbits, mice,
rats, sheep, goats, etc. Various adjuvants may be used to increase the immunological
response, depending on the host species, including but not limited to Freund's (complete
and incomplete), mineral gels such as aluminum hydroxide, surface active substances such
as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet
hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille
Calmette-Guerin) and Corynebacterium parvum.

For preparation of monoclonal antibodies directed toward HCV as described above, any 
technique that provides for the production of antibody molecules by continuous cell lines in 
culture may be used. These include but are not limited to the hybridoma technique 
originally developed by Kohler and Milstein [Nature 256:495-497 (1975)], as well as the
trioma technique, the human B-cell hybridoma technique [Kozbor et al., Immunology Today
4:72 (1983); Cote et al., Proc. Natl. Acad. Sci. U.S.A. 80:2026-2030 (1983)], and the
EBV-hybridoma technique to produce human monoclonal antibodies [Cole et al., in Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96 (1985)]. In an additional
embodiment of the invention, monoclonal antibodies can be produced in germ-free animals
[International Patent Publication No. WO 89/12690, published 28 December 1989]. In
fact, according to the invention, techniques developed for the production of "chimeric
antibodies" [Morrison et al., J. Bacteriol. 159:870 (1984); Neuberger et al., Nature
312:604-608 (1984); Takeda et al., Nature 314:452-454 (1985)] by splicing the genes from
a mouse antibody molecule specific for HCV together with genes from a human antibody
molecule of appropriate biological activity can be used; such antibodies are within the scope
of this invention. Such human or humanized chimeric antibodies are preferred for use in
therapy of human diseases or disorders (described infra), since the human or humanized 
antibodies are much less likely than xenogenic antibodies to induce an immune response, in 
particular an allergic response, themselves. 

According to the invention, techniques described for the production of single chain 
antibodies [U.S. Patent Nos. 5,476,786 and 5,132,405 to Huston; U.S. Patent 4,946,778]
can be adapted to produce HCV-specific single chain antibodies. An additional
embodiment of the invention utilizes the techniques described for the construction of Fab
expression libraries [Huse et al., Science 246:1275-1281 (1989)] to allow rapid and easy
identification of monoclonal Fab fragments with the desired specificity. 

Antibody fragments which contain the idiotype of the antibody molecule can be generated 
by known techniques. For example, such fragments include but are not limited to: the 
F(ab')2 fragment which can be produced by pepsin digestion of the antibody molecule; the 
Fab' fragments which can be generated by reducing the disulfide bridges of the F(ab')2
fragment; and the Fab fragments which can be generated by treating the antibody molecule
with papain and a reducing agent. 

HCV particles for subunit vaccination. The functional HCV-H cDNA clone, and similarly 
constructed and verified clones for other genotypes, can be used to produce HCV-like 
particles for vaccination. Proper glycosylation, folding, and assembly of HCV particles 
may be important for producing appropriately antigenic and protective subunit vaccines. 
Several methods can be used for particle production. They include engineering of stable 
cell lines for inducible or constitutive expression of HCV-like particles (using bacterial, 
yeast or mammalian cells), or the use of higher level eukaryotic heterologous expression 
systems such as recombinant baculoviruses, vaccinia viruses [Moss, Proc. Natl. Acad. Sci.
USA, 93:11341-11348 (1996)], or alphaviruses [Frolov et al., (1996) supra]. HCV
particles for immunization may be purified from either the media or disrupted cells,
depending upon their localization. Such purified HCV particles, or mixtures of particles
representing a spectrum of HCV genotypes, can be injected with or without various
adjuvants to enhance immunogenicity.

Infectious non-replicating HCV particles. In another manifestation, HCV particles capable
of receptor binding, entry, and translation of genome RNA can be produced. Heterologous
expression approaches for production of such particles include, but are not restricted to, E.
coli, yeast, or mammalian cell lines, appropriate host cells infected with or harboring
recombinant baculoviruses, recombinant vaccinia viruses, recombinant alphaviruses or
RNA replicons, or recombinant adenoviruses, engineered to express appropriate HCV
RNAs and proteins. In one example, two recombinant baculoviruses are engineered. One
baculovirus expresses the HCV structural proteins (e.g., C-E1-E2-p7) required for assembly
of HCV particles. A second recombinant expresses the entire HCV genome RNA, with
precise 5' and 3' ends, except that a deletion, such as ΔGDD, is included to inactivate the
HCV NS5B RDRP. Other mutations abolishing productive HCV replication could also be
utilized instead or in combination. Coinfection of appropriate host cells (Sf9, Sf21, etc.)
with both recombinants will produce high levels of HCV structural proteins and genome
RNA for packaging into HCV-like particles. Such particles can be produced at high levels,
purified, and used for vaccination. Once introduced into the vaccinee, such particles will
exhibit normal receptor binding and infection of HCV-susceptible cells. Entry will occur
and the genome RNA will be translated to produce all of the normal HCV antigens, except
that further replication of the genome will be completely blocked given the inactivated 5B
polymerase. Such particles are expected to elicit effective CTL responses against structural
and nonstructural HCV protein antigens. This vaccination strategy alone or preferably in
conjunction with the subunit strategy described above can be used to elicit high levels of
both neutralizing antibodies and CTL responses to help clear the virus. A variety of
different HCV genome RNA sequences can be utilized to ensure broadly cross-reactive and 
protective immune responses. In addition, modification of the HCV particles, either 
through genetic engineering, or by derivatization in vitro, could be used to target infection 
to cells most effective at eliciting protective and long lasting immune responses. 



Live-attenuated HCV derivatives. The ability to manipulate the HCV genome RNA
sequence and thereby produce mutants with altered pathogenicity provides a means of
constructing live-attenuated HCV mutants appropriate for vaccination. Such vaccine
candidates express protective antigens but would be impaired in their ability to cause
disease, establish chronic infections, trigger autoimmune responses, and transform cells.
Naturally, infectious HCV virus of the invention can be attenuated, inactivated, or killed by
chemical or heat treatment.

HCV-based Gene Expression Vectors 
Some of the same properties of HCV leading to chronic liver infection of humans may also
be of great utility for designing vectors for gene expression in cell culture systems, genetic 
vaccination, and gene therapy. The functional clones described herein can be engineered to 
produce chimeric RNAs designed for the expression of heterologous gene products (RNAs 
and proteins). Strategies have been described above and elsewhere [Bredenbeek and Rice,
(1992) supra; Frolov et al., (1996) supra] and include, but are not limited to (i) in-frame
fusion of the heterologous coding sequences with the HCV polyprotein; (ii) creation of 
additional cistrons in the HCV genome RNA; and (iii) inclusion of IRES elements to create 
multicistronic self-replicating HCV vector RNAs capable of expressing one or more 
heterologous genes (Figure 2). Functional HCV RNA backbones utilized for such vectors 
include, but are not limited to, (i) live-attenuated derivatives capable of replication and 
spread; (ii) RNA replication competent "dead end" derivatives lacking one or more viral
components (e.g., the structural proteins) required for viral spread; (iii) mutant
derivatives capable of high and low levels of HCV-specific RNA synthesis and 
accumulation; (iv) mutant derivatives adapted for replication in different human cell types; 
(v) engineered or selected mutant derivatives capable of prolonged noncytopathic 
replication in human cells. Vectors competent for RNA replication but not packaging or 
spread can be introduced either as naked RNA, DNA, or packaged into virus-like particles. 
Such virus-like particles can be produced as described above and composed of either 
unmodified or altered HCV virion components designed for targeted infection of the 
hepatocytes or other human cell types. Alternatively, HCV RNA vectors can be 
encapsidated and delivered using heterologous viral packaging machineries or encapsulated
into liposomes modified for efficient gene delivery. These packaging strategies, and
modifications thereof, can be utilized to efficiently target HCV vector RNAs to specific
cell types. Using methods detailed above, similar HCV-derived vector systems, competent 
for replication and expression in other species, can also be derived. 

Various methods, e.g., as set forth supra in connection with transfection of cells and DNA
vaccines, can be used to introduce an HCV vector of the invention. Of primary interest is
direct injection of functional HCV RNA or virions, e.g., in the liver. Targeted gene
delivery is described in International Patent Publication WO 95/28494, published October
1995. Alternatively, the vector can be introduced in vivo by lipofection. For the past
decade, there has been increasing use of liposomes for encapsulation and transfection of
nucleic acids in vitro. Synthetic cationic lipids designed to limit the difficulties and dangers
encountered with liposome mediated transfection can be used to prepare liposomes for in
vivo transfection of a gene encoding a marker [Felgner et al., Proc. Natl. Acad. Sci.
U.S.A. 84:7413-7417 (1987); see Mackey et al., Proc. Natl. Acad. Sci. U.S.A.
85:8027-8031 (1988); Ulmer et al., Science 259:1745-1748 (1993)]. The use of cationic lipids may
promote encapsulation of negatively charged nucleic acids, and also promote fusion with
negatively charged cell membranes [Felgner and Ringold, Science 337:387-388 (1989)].
The use of lipofection to introduce exogenous genes into the specific organs in vivo has 
certain practical advantages. Molecular targeting of liposomes to specific cells represents 
one area of benefit. It is clear that directing transfection to particular cell types would be 
particularly advantageous in a tissue with cellular heterogeneity, such as pancreas, liver, 
kidney, and the brain. Lipids may be chemically coupled to other molecules for the 
purpose of targeting [see Mackey et al., supra]. Targeted peptides, e.g., hormones or
neurotransmitters, and proteins such as antibodies, or non-peptide molecules could be
coupled to liposomes chemically. Receptor-mediated DNA delivery approaches can also be
used [Curiel et al., Hum. Gene Ther. 3:147-154 (1992); Wu and Wu, J. Biol. Chem.
262:4429-4432 (1987)]. 

Examples of applications for gene therapy include, but are not limited to, (i) expression of
enzymes or other molecules to correct inherited or acquired metabolic defects; (ii) 
expression of molecules to promote wound healing; (iii) expression of immunomodulatory 
molecules to promote immune-mediated regression or elimination of human cancers; (iv) 
targeted expression of toxic molecules or enzymes capable of activating cytotoxic drugs in 
tumors; (v) targeted expression of anti-viral or anti-microbial agents in pathogen-infected
cells. Various therapeutic heterologous genes can be inserted in a gene therapy vector of
the invention, such as but not limited to adenosine deaminase (ADA) to treat severe
combined immunodeficiency (SCID); marker genes or lymphokine genes into tumor
infiltrating (TIL) T cells [Kasis et al., Proc. Natl. Acad. Sci. U.S.A. 87:473 (1990); Culver
et al., ibid. 88:3155 (1991)]; genes for clotting factors such as Factor VIII and Factor IX
for treating hemophilia [Dwarki et al., Proc. Natl. Acad. Sci. USA, 92:1023-1027 (1995);
Thompson, Thromb. and Haemostasis, 66:119-122 (1991)]; and various other well known
therapeutic genes such as, but not limited to, β-globin, dystrophin, insulin, erythropoietin,
growth hormone, glucocerebrosidase, β-glucuronidase, α-antitrypsin, phenylalanine
hydroxylase, tyrosine hydroxylase, ornithine transcarbamylase, apolipoproteins, and the
like. In general, see U.S. Patent No. 5,399,346 to Anderson et al.

Examples of applications for genetic vaccination (for protection from pathogens other than 
HCV) include, but are not limited to, expression of protective antigens from bacterial (e.g., 
uropathogenic E. coli, Streptococci, Staphylococci, Neisseria), parasitic (e.g., Plasmodium,
Leishmania, Toxoplasma), fungal (e.g., Candida, Histoplasma), and viral (e.g., HIV, HSV,
CMV, influenza) human pathogens. Immunogenicity of protective antigens expressed using
HCV-derived RNA expression vectors can be enhanced using adjuvants, including
co-expression of immunomodulatory molecules, such as cytokines (e.g., IL-2, GM-CSF) to
facilitate development of desired Th1 versus Th2 responses. Such adjuvants can be either
incorporated and co-expressed by HCV vectors themselves or administered in combination 
with these vectors using other methods. 

Diagnostic Methods for Infectious HCV 
Diagnostic cell lines. The invention described herein can also be used to derive cell lines 
for sensitive diagnosis of infectious HCV in patient samples. In concept, functional HCV 
components are used to test and create susceptible cell lines (as identified above) in which 
easily assayed reporter systems are selectively activated upon HCV infection. Examples 
include, but are not restricted to, (i) defective HCV RNAs lacking replicase components 
that are incorporated as transgenes and whose replication is upregulated or induced upon
HCV infection; (ii) sensitive heterologous amplifiable reporter systems activated by HCV
infection. In the first manifestation, cis RNA signals required for HCV RNA amplification
flank a convenient reporter gene, such as luciferase, green fluorescent protein (GFP),
β-galactosidase, or a selectable marker (see above). Expression of such chimeric RNAs is
driven by an appropriate nuclear promoter and elements required for proper nuclear
processing and transport to the cytoplasm. Upon infection of the engineered cell line with
HCV, cytoplasmic replication and amplification of the transgene is induced, triggering 
higher levels of reporter expression, as an indicator of productive HCV infection. 

In the second example, cell lines are designed for more tightly regulated but highly 
inducible reporter gene amplification and expression upon HCV infection. Although this 
amplified system is described in the context of specific components, other equivalent
components can be used. In one such system, diagrammed in Figure 3, an engineered
alphavirus replicon transgene is created which lacks the alphavirus nsP4 polymerase, an 
enzyme absolutely required for alphavirus RNA amplification and normally produced by 
cleavage from the nonstructural polyprotein. Additional features of this defective 
alphavirus replicon include a subgenomic RNA promoter, driving expression of a luciferase 
or GFP reporter gene. This promoter element is quiescent in the absence of productive 
cytoplasmic alphavirus replication. The cell line contains a second transgene for expression
of a gene fusion consisting of the HCV NS4A protein and the alphavirus nsP4 RDRP. This
fused gene is expressed and targeted to the cytoplasmic membrane compartment, but this
form of nsP4 would be inactive as a functional component of the alphavirus replication
complex because a discrete nsP4 protein, with a precise N terminus, is required for nsP4
activity [Lemm et al., EMBO J. 13:2925 (1994)]. An optional third transgene expresses a
defective alphavirus RNA with cis signals for replication, transcription of subgenomic RNA 
encoding a ubiquitin-nsP4 fusion, and an alphavirus packaging signal. Upon infection of 
such a cell line by HCV, the HCV NS3 proteinase is produced and mediates trans cleavage
of the NS4A-nsP4 fusion protein, activating the nsP4 polymerase. This active polymerase,
which functions in trans and is effective in minute amounts, then forms a functional
alphavirus replication complex leading to amplification of the defective alphavirus replicon
as well as the defective alphavirus RNA encoding ubiquitin-nsP4. Ubiquitin-nsP4,
expressed from its subgenomic RNA, is cleaved efficiently by cellular ubiquitin
carboxy terminal hydrolase to produce additional nsP4, in case this enzyme is limiting.
Once activated, this system would produce extremely high levels of the reporter protein. 
Such an HCV infectivity assay is expected to take just hours (for sufficient
reporter gene expression). 
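
The gating logic of this second reporter system can be summarized as a toy truth table. The Python sketch below is illustrative only and is not part of the described system: the function and parameter names are hypothetical, the optional third (ubiquitin-nsP4) transgene is omitted, and every step is reduced to a Boolean, so kinetics and expression levels are ignored.

```python
def reporter_active(hcv_infected: bool,
                    has_fusion_transgene: bool = True,
                    has_replicon_transgene: bool = True) -> bool:
    """Toy Boolean model of the NS3-gated alphavirus reporter cascade.

    All names here are hypothetical; this is a logic sketch, not an assay.
    """
    # The HCV NS3 proteinase is only produced in HCV-infected cells.
    ns3_present = hcv_infected
    # NS3 cleaves the NS4A-nsP4 fusion in trans, releasing discrete nsP4
    # with the precise N terminus required for polymerase activity.
    nsp4_active = ns3_present and has_fusion_transgene
    # Active nsP4 completes the alphavirus replication complex, which
    # amplifies the nsP4-deficient replicon and switches on the
    # subgenomic promoter driving the reporter gene.
    return nsp4_active and has_replicon_transgene


if __name__ == "__main__":
    for infected in (False, True):
        print(f"HCV infected={infected} -> reporter on={reporter_active(infected)}")
```

In this simplified view the reporter stays off in uninfected cells because the fused nsP4 is never released, which is the property that makes the design tightly regulated.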

Antibody diagnostics. In addition to the cell lines described here, HCV virus particles
(virions) produced by the transfected or infected cell lines, or isolated from an infected
animal, may be used as antigens to detect anti-HCV antibodies in patient blood or blood 
products. Because the HCV virus particles are derived from an authentic HCV genome, 
they are likely to have structural characteristics that more closely resemble or are identical 
to natural HCV virus. These reagents can be used to establish that a patient is infected with 
HCV by detecting seroconversion, i.e., generation of a population of HCV-specific
antibodies. 
Alternatively, antibodies generated to the authentic HCV products prepared as described
herein can be used to detect the presence of HCV in biological samples from a subject. 

The present invention may be better understood by reference to the following non-limiting 
Examples, which are provided as exemplary of the invention. 

EXAMPLES

The following examples report on the background experimental work, initial unsuccessful 
efforts to prepare an HCV DNA encoding infectious HCV RNA, and finally generation of a 
functional clone. 

EXAMPLE 1. Analysis of HCV-H Genome Structure and Expression
Rationale for the HCV-H strain, cDNA cloning, sequence analysis, and assembly of nearly
full-length cDNA clones. The HCV-H strain was chosen for the initial studies since this isolate
has been extensively characterized in chimpanzees by Purcell and colleagues [see Shimizu
et al., (1990) supra] and more recently in vitro by Shimizu and coworkers [Hijikata et al.,
(1993) supra; Shimizu et al., J. Virol. 68: 1494-1500 (1994); Shimizu et al., Proc. Natl.
Acad. Sci. USA 89: 5477-5481 (1992); Shimizu et al., Proc. Natl. Acad. Sci. USA 90:
6037-6041 (1993)]. HCV-H is a genotype 1a human isolate from an American with
posttransfusion NANB hepatitis [Feinstone et al., J. Infect. Dis. 144: 588-598 (1981)].

Initial cDNA cloning and sequence analysis of HCV-H. The original HCV-H77 isolate was 
passaged twice in chimpanzees, both of whom developed elevated serum ALT levels and 
acute hepatitis. Liver tissue from the second chimpanzee passage was used for preparation 
of crude RNA suitable for cDNA synthesis and nested PCR amplification. PCR-amplified 
cDNA was cloned into plasmid expression vectors and several independent clones were 
isolated and used for sequence analysis, expression studies and reconstructing longer cDNA 
clones. Utilizing partial sequence data and restriction enzyme mapping, a clone containing
nearly the entire HCV-H cDNA, called pTET/T7HCVFLCMR, was assembled and
sequenced [Daemer et al., unpublished; Grakoui et al., J. Virol. 67: 1385-1395 (1993c)].
The HCV sequence contained in this plasmid is subsequently referred to as HCV-H CMR
(SEQ ID NO: 19). The sequence of this clone is colinear and 98.5% homologous (at the
nucleotide level) to the chimp-passaged HCV-H77 sequence published by Inchauspe et
al. [Inchauspe et al., Proc. Natl. Acad. Sci. USA 88: 10292-10296 (1991)] and shows even
greater similarity to the partial HCV-H90 sequences published by Ogata et al. [Ogata et al.,
(1991) supra].
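
A figure such as "98.5% homologous at the nucleotide level" for two colinear sequences is commonly obtained as a simple percent identity over aligned positions. The sketch below assumes that interpretation, which the text does not spell out; the function name and the toy sequences are hypothetical and are not taken from any HCV isolate.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that match between two pre-aligned sequences.

    Assumes the sequences are colinear and already aligned to equal length;
    no gap handling or alignment is performed here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Toy 8-nucleotide example with a single mismatch at the last position.
    print(percent_identity("GCCAGCCC", "GCCAGCCU"))
```

Real comparisons of full-length genomes would first require an alignment step, which is deliberately left out of this sketch.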

Characterization of a prototype HCV-H clone. HCV-H cDNA clones and immune reagents 
have been used in cell-free translation and cell culture transient expression assays to provide 
a fairly detailed picture of HCV-H gene expression. In general terms, these results are 
similar to those obtained by others for different HCV genotypes. This work included: (i) 
the identification and mapping of HCV-H polyprotein cleavage products [Grakoui et al.,
(1993c) supra; Lin et al., (1994a) supra]; (ii) determining the sites of proteolytic processing
[Grakoui et al., J. Virol. 67: 2832-2843 (1993a); Grakoui et al., Proc. Natl. Acad. Sci.
USA 90: 10583-10587 (1993b); Lin et al., (1994a) supra]; (iii) characterization of the
NS2-3 autoproteinase [Grakoui et al., (1993b) supra; Reed et al., J. Virol. 69: 4127-4136
(1995)], the NS3-4A serine proteinase [Grakoui et al., (1993a) supra; Lin et al., J. Virol.
68: 8147-8157 (1994b); Lin and Rice, Proc. Natl. Acad. Sci. USA 92: 7622-7626 (1995);
Lin et al., J. Virol. 69: 4373-4380 (1995)] and their cleavage requirements [Kolykhalov et
al., J. Virol. 68: 7525-7533 (1994); Reed et al., (1995) supra]; (iv) studies on the NS4A
serine proteinase cofactor and its association with NS3 [Lin et al., (1994b) supra; Lin and
Rice, (1995) supra; Lin et al., (1995) supra]; and (v) an examination of HCV glycoprotein
biogenesis including folding and association with calnexin, oligomer formation, and
subcellular localization [Dubuisson et al., (1994) supra; Dubuisson and Rice, (1996)
supra]. Assays for other biologically important activities have been developed using the
prototype HCV-H cDNA clones, including RNA-stimulated NTPase and RNA helicase
activities associated with partially purified NS3 [Suzich et al., (1993) supra] and an
RNA-dependent RNA polymerase activity. Antigens expressed from this cloned cDNA can also
be recognized by sera [see Grakoui et al., (1993c) supra] and cytotoxic T lymphocytes
[Battegay et al., J. Virol. 69: 2462-2470 (1995); Koziel et al., J. Clin. Invest. 96:2311-21
(1995)] from patients with chronic HCV infections.

For the present invention, the work on HCV polyprotein processing provided a means of 
prescreening candidate full-length clones for a functional IRES element, an intact ORF, and 
proper membrane topology and active viral proteinases as evidenced by the production of 
all 10 polyprotein cleavage products. 
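
The prescreen described above amounts to checking that every expected cleavage product is detected for a candidate clone. A minimal sketch of that bookkeeping follows; the product list uses the conventional names of the ten HCV polyprotein products, which is an assumption since this passage does not enumerate them, and the function names are hypothetical.

```python
# Conventional names of the ten HCV polyprotein cleavage products
# (assumed here; the passage above only states that there are ten).
EXPECTED_PRODUCTS = ("C", "E1", "E2", "p7", "NS2", "NS3",
                     "NS4A", "NS4B", "NS5A", "NS5B")


def missing_products(observed):
    """Return expected cleavage products not seen for a candidate clone."""
    seen = set(observed)
    return [p for p in EXPECTED_PRODUCTS if p not in seen]


def passes_prescreen(observed):
    """A clone passes only if all ten products were detected."""
    return not missing_products(observed)


if __name__ == "__main__":
    # Hypothetical clone in which processing stops before NS5A/NS5B.
    truncated = ["C", "E1", "E2", "p7", "NS2", "NS3", "NS4A", "NS4B"]
    print(passes_prescreen(truncated), missing_products(truncated))
```

A clone that fails this check would be set aside before any infectivity testing, since a missing product points to a defect in the IRES, the ORF, or one of the viral proteinases.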

EXAMPLE 2. First Attempt at Recovery of Functional HCV from cDNA
Plasmid constructions. The preferred strategy for production of potentially infectious
HCV RNA transcripts of high specific infectivity [see Ahlquist et al., Proc. Natl. Acad. Sci.
USA 81: 7066-7070 (1984); Rice et al., New Biol. 1: 285-296 (1989); Rice et al., (1987)
supra and refs. therein] involved cloning of candidate full-length HCV cDNAs
immediately downstream from a bacteriophage promoter (SP6 or T7) with a unique
restriction site following the HCV 3' terminus for production of run-off RNA transcripts
(Figure 4). The T7 or SP6 transcription systems were chosen for production of potentially 
infectious RNAs for several reasons. First, numerous examples exist for other RNA 
viruses where either T7 or SP6 have been successfully used to transcribe high yields of 
relatively high specific infectivity capped or uncapped RNA transcripts [Boyer and Haenni,
J. Gen. Virol. 198: 415-426 (1994)]. In addition, the T7 system is particularly useful since
it allows not only in vitro synthesis of defined RNAs for transfection, but also several in 
vivo approaches using transfection of plasmid DNA. One example is the vaccinia-T7 
system where a vaccinia recombinant expressing the T7 RNA polymerase allows 
cytoplasmic transcription of transfected plasmid templates [Fuerst et al., Proc. Natl. Acad.
Sci. USA 83: 8122-8126 (1986)]. A second in vivo approach, obviating the need for
vaccinia virus, is cotransfection of a plasmid expressing T7 RNA polymerase [Chen et al., 
(1994) supra]. Transfection with HCV plasmid DNAs, designed for production of 
transcripts with defined 5' and 3' termini, might be advantageous given the susceptibility of 
long RNAs to degradation during transfection procedures [Ball, (1992) supra; Pattnaik et 
al., (1992) supra]. However, these in vivo methods do not allow precise control over the
structure of the transcribed RNAs and their export to the cytoplasm where HCV RNA
replication is believed to occur. Hence, the in vitro transcription method has usually been
employed in our work.

The sequenced prototype HCV-H cDNA clone used for the majority of the processing 
studies was the starting material for these constructions. Since the terminal sequences of 
the HCV-H genome RNA were unknown when these experiments were initiated, sequences 
reported for other isolates were used to engineer the 5' and 3' ends by PCR. For the first
set of constructs tested (Figure 4), the additional 5' terminal sequence was derived from
the HCV-1 isolate [Han et al., (1991) supra]. For the 3' NTR, plasmids with two alternative
structures were constructed. One pair (SP6 or T7) contained the 3' NTR and terminal poly
(A) tract reported for HCV-1 by Han [Han et al., (1991) supra]. A second pair was
constructed using a consensus 3' NTR sequence for all other isolates followed by a 3' 
terminal poly (U) tract. 

Methods for assaying infectivity of HCV RNA. A desirable method for initial identification
of potentially functional clones would be to screen for RNA replication after transfection of 
permissive cell cultures. While several laboratories have reported infection and replication 
in various cell cultures (see Background of the Invention, supra, and below), these systems 
are extremely inefficient, poorly characterized, and difficult to reproduce. Factors 
precluding efficient replication in vitro are unknown but may involve one or multiple stages 
in the virus life cycle (attachment, entry, RNA replication, assembly or release). 
Furthermore, no one has shown that HCV produced in cell culture is "authentic", e.g., 
capable of causing disease in the chimpanzee model. For these reasons, as well as the
technical difficulties associated with unambiguously demonstrating replication after RNA 
transfection, the chimpanzee model was used to identify functional clones from the library. 
Surgical procedures and direct intrahepatic inoculation were used, since this technique had 
been successful for demonstrating infectivity of rabbit hemorrhagic disease virus virion 
RNA [Ohlinger et al., J. Virol. 64: 3331-3336 (1990)] and for hepatitis A virus RNA 
produced by in vitro transcription [Emerson et al., J. Virol. 66: 6649-6654 (1992)]. 

Capped or uncapped full-length RNA transcripts were synthesized from each of the four 
linearized plasmid templates and assayed for infectivity by direct intrahepatic inoculation of 
chimpanzee liver using a percutaneous liver biopsy technique. Briefly, after RNA 
transcription, reactions were digested with DNase, extracted with phenol, and the RNAs 
collected by ethanol precipitation. The yield and integrity of each transcript RNA was 
determined by agarose gel electrophoresis under denaturing conditions. Equal amounts of 
each of the poly (U)- or poly (A)-containing transcripts (SP6, T7, capped, uncapped) were 
pooled and assayed separately in two animals. These animals had not previously been 
exposed to HCV or pooled blood products and were HCV antibody and RNA negative. 
For each animal, two injection sites were used. At one site, 200 μg pooled RNA in 1 ml 
RNase-free PBS was injected. At the second site, 200 μg pooled RNA mixed with 0.8 ml 
RNase-free PBS and 200 LIPOFECTIN (BRL) was injected. Pre- and post-inoculation 
plasma and liver biopsy samples were collected weekly. Plasma samples were assayed for 
ALT and GGTP (indicators of liver damage), for HCV-specific antibodies using available 
serological assays, and for evidence of circulating HCV RNA by RT/PCR. Besides 
histologic examination of liver biopsy tissue, samples were also stored for possible analysis 
by immunofluorescence and electron microscopy. Despite following the animals for 6 
months, no evidence of productive HCV infection was found using any of these assays. 

Using methods described more fully below, transcripts from these clones were also assayed 
for infectivity in several different cell types. In some cases, HCV antigens could be 
detected in transfected cells for several days; however, similar results were obtained using 
control HCV transcripts containing a deletion in the NS5B RDRP, which should be inactive 
for replication. Thus, no convincing evidence for replication was obtained in the first set of 
experiments. 

EXAMPLE 3: Second Attempt to Recover HCV from cDNA 
Possible reasons for failure of Attempt I. Several possible explanations, alone or in 
combination, could account for previous unsuccessful attempts to recover infectious HCV 
RNA from prototype HCV-H clones (pTET/HCVFLCMR). These include missing or 
incorrect terminal sequences, internal errors deleterious or lethal for HCV replication, or 
inadequate methods for assaying infectivity and replication. To address the first concern, 
the HCV-H 5' and 3' terminal sequences were rigorously determined. To increase the 
chances of recovering a full-length clone free of deleterious errors, high fidelity RT/PCR 
and assembly PCR were used to construct a new library of full-length HCV-H clones which 
included the new terminal sequences. Multiple clones from the library were tested for 
infectivity in the chimpanzee model. 

Rationale for rigorously determining the HCV-H termini. As mentioned above, the 5' and 
3' terminal sequences of HCV-H were unknown; the previous attempts (Example 2) to 
generate functional transcripts were from cDNA clones bearing terminal sequences 
determined for other HCV isolates. Studies in other RNA virus systems have shown that 
specific terminal sequences are critical for the generation of functional, replication 
competent RNAs [reviewed in Boyer and Haenni, (1994) supra]. Such sequences are 
believed to be involved in initiation of negative- and positive-strand RNA synthesis. In 
some cases, a few additional bases, or even longer non-viral sequences, are tolerated at the 
5' and 3' termini; these sequences are typically lost or selected against during authentic 
viral replication. For other RNA viruses, extra bases, particularly at the 5' terminus, are 
deleterious. In contrast, transcripts lacking authentic terminal sequences are usually non- 
functional. For instance, deletion of the 3' terminal secondary structure or conserved 
sequence elements in the 3' NTR of flavivirus genome RNA is lethal for YF or TBE RNA 
replication. Given the importance of these sequence elements for other viruses, we have 
attempted to more rigorously determine the HCV-H terminal sequences. 

Structure of the HCV-H 5' NTR. Methods used to amplify and clone the extreme 5' termini 
of RNAs include homopolymer tailing or ligation of synthetic oligonucleotides to first- 
strand cDNA (5' RACE) [Schaefer, Anal. Biochem. 227: 255-273 (1995)], cyclization of 
first-strand cDNA followed by inverse PCR [Zeiner and Gehring, BioTechniques 17: 
1051-1053 (1994)], or cyclization of genome RNA with RNA ligase (after treatment to 
remove 5' cap structures, if necessary) followed by cDNA synthesis and PCR amplification 
across the 5'-3' junction [Mandl et al., BioTechniques 10: 486 (1991)]. Each of these 
approaches has its own set of problems, especially for rare RNAs. Despite this, 5' terminal 
sequences have been determined for a number of HCV isolates and are in general 
agreement. For HCV-H, both the cyclization/inverse PCR and 5' RACE methods were 
used to determine a 5'-terminal consensus sequence for HCV-H RNA from high titer H77 
plasma (new data for HCV-H are shown in bold): 

5'-GCCAGCCCCCTGATGGGGGCGACACTCCACCATGAATC...-3' (SEQ ID NO:3) 
This sequence is highly homologous to those determined for other isolates, but differs from 
our prototype full-length cDNA sequence at two positions (underlined). At lower 
frequency, clones with additional 5' residues (usually 1 additional G) were also recovered. 
Table 1 summarizes the results of the 5' terminal analyses. 

Table 1. Results of the 5' end analysis of the HCV-H cDNA clones 

Number of clones    5' end 
18                  GCCAGCC... 
3*                  NCCAGCC... 
18*                 NNCCAGCC... 
9                   GGCCAGCC... 
3                   TGCCAGCC... 
1                   AGCCAGCC... 
2                   AAGCCAGCC... 
1                   GCGCCAGCC... 

*Sequences were not determined; the number of nucleotides on the 5' end was determined 
by relative electrophoretic mobility of restriction fragments. 



Eighteen clones began with the sequence 5'-GCCAGCC...-3'; nine clones with the 
sequence 5'-GGCCAGCC...-3'; three clones with the sequence 5'-UGCCAGCC...-3'; one 
clone with the sequence 5'-AGCCAGCC...-3'; two clones with the sequence 5'- 
AAGCCAGCC...-3'; and three clones with the sequence 5'-GCGCCAGCC...-3'. Besides 
these sequenced clones, eighteen clones with one additional 5' base were identified by 
restriction analysis. Of note is the observation that a sequence reported for a genotype 1b 
isolate initiates with a U residue (5'-UGCCA...-3'). Although these results might indicate 
the presence of additional sequences or heterogeneity at the HCV 5' terminus, the 
additional bases may be artifactual and created by partial copying of a 5' cap structure or 
addition of non-templated 3' bases by reverse transcriptase during first-strand cDNA 
synthesis. It cannot be excluded that the 5' terminus of HCV genome RNA contains a 5' 
cap structure or a covalently linked terminal protein such as VPg of the picornaviruses 
[Vartapetian and Bogdanov, Prog. Nucleic Acid Res. Mol. Biol. 34: 209-51 (1987)]. These 
possibilities will remain unresolved until it becomes possible to directly determine the 
structure of the 5' terminus of HCV genome RNA. For the pestiviruses, recent results 
suggest that genome RNAs may not contain a 5' cap [Brock et al., J. Virol. Meth. 38: 
39-46 (1992)] and that this structure is not required for infectivity of transcribed RNA 
[Meyers et al., J. Virol. 70: 8606-8613 (1996a); Meyers et al., J. Virol. 70: 1588-95 
(1996b); Moormann et al., J. Virol. 70: 763-70 (1996); Ruggli et al., J. Virol. 70: 3478-87 
(1996); Vassilev et al., J. Virol. 71: 471-478 (1997)]. 

Structure of the HCV-H 3' NTR. Determination of the extreme 3' terminal HCV sequences 
is described in co-pending, co-owned U.S. Patent Application Serial No. 08/520,678, filed 
August 29, 1995, which is incorporated herein by reference in its entirety, and PCT 
International Application No. PCT/US96/14033, filed August 28, 1996. Briefly, these 
results showed that the HCV 3' NTR consists of three elements (positive-sense, 5' to 3'): 
(i) a short sequence with significant variability among genotypes; (ii) a homopolymeric poly 
(U) tract followed by a polypyrimidine stretch consisting of mainly U with interspersed C 
residues and; (iii) a novel sequence of 98 bases. This novel 98-base sequence was not 
present in human genomic DNA and is highly conserved among HCV genotypes. The 3'- 
terminal 46 bases are predicted to form a stable stem-loop structure. Using a quantitative- 
competitive RT/PCR assay, a substantial fraction of HCV genome RNAs from a high 
specific-infectivity inoculum were found to contain this 3' terminal sequence element. 
These results indicated that the HCV genome RNA terminates with a highly conserved 
RNA element, which is likely to be required for authentic HCV replication and therefore, 
for recovery of infectious RNA from cDNA. These results have been confirmed by two 
other groups [Tanaka et al., (1995) supra; Tanaka et al., (1996) supra; Yamada et al., 
(1996) supra]. A large number of clinical isolates have also been examined and shown to 
contain the novel conserved 3' terminal element [Umlauft et al., J. Clin. Invest. 34: 
2552-2558 (1996)]. 

Recipient vector containing the HCV H77 5' and 3' consensus sequences. Based on our 
analysis of the HCV-H terminal sequences, a recipient vector was constructed that 
contained the determined consensus H77 sequences 5' to the KpnI (580) and 3' to the NotI 
(9219) site (these terminal HCV sequences are identical to those in p90/HCVFLlong pU, see 
below, SEQ ID NO:5). This vector is designated pTET/T7HCVABglII/5'3'corr. and was 
used for construction of the combinatorial full-length library described below. 

Additional considerations for construction of full-length cDNA libraries for the HCV-H 
strain. As for the previous attempt (Example 2), the strategy for the second try involved 
the construction of full-length cDNA templates in plasmid vectors that could be transcribed 
in vitro or in vivo using bacteriophage DNA-dependent RNA polymerases. Besides having 
correct 5' and 3' termini, RNA transcripts must also encode a full complement of functional 
HCV polypeptides. To minimize the possibility of cloning defective HCV genomes, high 
specific infectivity HCV-H plasma (H77) was used as a source of virion RNA for our new 
libraries (as mentioned earlier, the previous clone was assembled from cDNA made from 
infected chimp liver RNA). However, reverse transcription and multiple cycles of 
amplification prior to cDNA cloning raised the chances that HCV cDNA templates would 
contain one or more mutations deleterious for virus replication. For these reasons, complex 
libraries of full-length clones were constructed using high fidelity assembly PCR and then 
screened in pools for production of infectious RNA. 

Construction of a new library of full-length HCV-H cDNA clones. We screened 41 HCV 
primer pairs and found 11 sets useful for amplifying overlapping 1-4 kb portions of the 
genome RNA (Figure 5 and Tables 2 and 3). 

Table 2. Oligonucleotides used for amplification of HCV-H cDNA. 

Name       Sequence (5' to 3')               SEQ ID NO:   Position in HCV-H and orientation 
SF49       GGCGACACTCCACCATAGATC             6            (+) 18-38 
SF128      TGGCACTACCCTCCAAGACC              7            (+) 1800-1819 
SF162      ATGACACAAGGGGGCGCTCCGCACACT       8            (-) 2027-2053 
SF131      TCCTGCTTGTGGATGATG                9                2538-2555 
SF152      TAGTTTGGTGATGTCA                  10           (-) 2999-3014 
PCL10067   ACATAGGTGCCAGTAAG                 11           (-) 3171-3188 
PCL10066   CTGGCAACGTGCATCA                  12           (+) 3549-3564 
CMR115     GGCTGAGAACAATTACCA                13           (+) 4183-4200 
CMR117     ATTGATGCCCAATGCG                  14           (-) 4565-4580 
SF140      ACTGCCTGGGATTCCCT                 15               6347-6363 
SF155      CCACAGTGGCAGCGAGTG                16           (-) 6419-6436 
SF156      CATGGACGtCAACACG                  17           (-) 6848-6863 
SF1045     AATCTTCACCGGTTGGGGAGGAGGTAGATG    18           (-) 9353-9391 

Table 3. Fragments and primers used in original and assembly PCR. 

Fragments in                          Resulting    Position in HCV genome 
assembly        Primer pairs          fragment†    start*     end* 
Original PCR    SF49, SF162           A            39         2026 
Original PCR    SF128, SF152          B            1820       2998 
Original PCR    SF128, PCL10067       C            1820       3170 
Original PCR    SF131, CMR117         D            2556       4564 
Original PCR    PCL10066, SF155       E            3565       6418 
Original PCR    CMR115, SF156         F            4201       6847 
Original PCR    SF140, SF1045         G            6364       9352 
A+B             SF49, SF152           H            39         2998 
A+C             SF49, PCL10067        J            39         3170 
B+D             SF128, CMR117         L            1820       4564 
J+L             SF49, CMR117          K            39         4564 
F+G             CMR115, SF1045        M            4201       9352 
E+G             PCL10066, SF1045      N            3565       9352 
L+M             SF128, SF1045         O            1820       9352 
H+O             SF49, SF1045          #2           39         9352 
J+O             SF49, SF1045          #3           39         9352 
K+N             SF49, SF1045          #5           39         9352 
K+M             SF49, SF1045          #6           39         9352 

*excluding primer 
† see Figure 5 
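
The assembly scheme in Table 3 can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the described method: it takes the primer-excluded coordinates from Table 3, confirms that each pair of fragments joined by assembly PCR actually overlaps, and derives the span of each product. The names ORIGINAL, STEPS and assemble are hypothetical, and the third original fragment (SF128 with PCL10067) is assumed to be fragment C.

```python
# Primer-excluded coordinates (start, end) of the original PCR fragments,
# copied from Table 3. Fragment "C" (SF128 with PCL10067) is an assumed label.
ORIGINAL = {
    "A": (39, 2026), "B": (1820, 2998), "C": (1820, 3170), "D": (2556, 4564),
    "E": (3565, 6418), "F": (4201, 6847), "G": (6364, 9352),
}

# Assembly steps from Table 3 as (product, left fragment, right fragment).
STEPS = [
    ("H", "A", "B"), ("J", "A", "C"), ("L", "B", "D"), ("K", "J", "L"),
    ("M", "F", "G"), ("N", "E", "G"), ("O", "L", "M"),
    ("#2", "H", "O"), ("#3", "J", "O"), ("#5", "K", "N"), ("#6", "K", "M"),
]


def assemble(original, steps):
    """Return (spans, overlaps) for every assembly product.

    Raises ValueError if two fragments to be joined share no sequence.
    """
    spans = dict(original)
    overlaps = {}
    for product, left, right in steps:
        (l_start, l_end), (r_start, r_end) = spans[left], spans[right]
        shared = min(l_end, r_end) - max(l_start, r_start) + 1
        if shared <= 0:
            raise ValueError(f"{left} and {right} do not overlap")
        spans[product] = (min(l_start, r_start), max(l_end, r_end))
        overlaps[product] = shared
    return spans, overlaps


spans, overlaps = assemble(ORIGINAL, STEPS)
for name in ("#2", "#3", "#5", "#6"):
    print(name, spans[name], "overlap in final step:", overlaps[name])
```

Under these assumptions, all four final products span nucleotides 39-9352, in agreement with the last four rows of Table 3.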

A mixture of thermostable enzymes was used to reduce error frequency and enhance 
synthesis of full-length products [Barnes, Proc. Natl. Acad. Sci. USA 91: 2216-2220 (1994); 
Lundberg et al., Gene 108: 1-6 (1991)]. Such intermediate PCR products were combined 
to produce full-length HCV cDNA using sequential rounds of assembly PCR [Mullis et al., 
Cold Spring Harbor Symp. 51: 263-273 (1986); Stemmer, (1994) supra]. Assembly PCR 
utilized primers at the extreme termini of the two overlapping fragments to be combined 
and a limited number of amplification cycles (Figure 6). This approach has the advantage 
of generating complex combinatorial libraries which should contain some fraction of 
functional error-free HCV cDNA templates. A prime consideration for this approach is 
making sure that the library contains sufficient complexity to assure that some clones will 
be error-free. For each of the initial amplification reactions, dilutions of the first-strand 
cDNA were tested (Figure 7) to show that multiple independent cDNA molecules were 
being amplified (greater than 7 to 100; indicated in Figure 5). As shown in Figure 7, the 
full-length library contained greater than 5.6 x 10^5 (80 x 7 x 10 x 10 x 10) different 
combinations. Possible deleterious mutations could have been introduced into half of the 
clones if the primer sequences chosen for PCR amplification and assembly were incorrect. 
However, it was later verified that no heterogeneity existed in the sequences corresponding 
to the primers used for PCR. 
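
The complexity estimate above is a simple product of the lower-bound numbers of independent templates per segment. As an illustrative sketch only (the function name library_complexity is hypothetical, and free assortment of segments during assembly is an assumption), the arithmetic can be written as:

```python
from math import prod

# Lower-bound numbers of independent cDNA templates per segment, taken from
# the factors quoted in the text (80 x 7 x 10 x 10 x 10).
INDEPENDENT_TEMPLATES = [80, 7, 10, 10, 10]


def library_complexity(templates_per_segment):
    """Number of distinct full-length combinations if segments assort freely."""
    return prod(templates_per_segment)


complexity = library_complexity(INDEPENDENT_TEMPLATES)
print(f"{complexity:,} combinations ({complexity:.1e})")
```

The product is 560,000, i.e. 5.6 x 10^5 combinations, which is a lower bound because each factor is itself a minimum estimate.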

The majority of the HCV-H77 genome (nucleotides 39-9352) was assembled and 
amplified in this manner and cloned as a KpnI (580)-NotI (9219) fragment into recipient 
plasmid (pTET/T7HCVABglII5'3'corr.) to produce the full-length library. As described 
above, pTET/T7HCVABglII5'3'corr. contains the T7 promoter, the consensus HCV-H 5' 
and 3'-terminal sequences 5' to the KpnI site and 3' from the NotI site, and a HpaI site for 
template linearization and production of run-off RNA transcripts. It should be noted that 
linearization with HpaI is predicted to produce run-off transcripts that contain one extra 3' 
U residue. 

Clones from the library were chosen for infectivity assays based on two criteria. First, a 
series of restriction digests was performed to eliminate clones that had obvious deletions or 
insertions in the HCV cDNA. Two hundred thirty-three clones were analyzed and clones 
passing this screen were then analyzed using the vaccinia-T7 transient expression system 
[see Grakoui et al., (1993a) supra; Grakoui et al., (1993c) supra] for production of the 
expected HCV polyprotein cleavage products. Full-length clones could be analyzed 
directly using this technique, since preliminary studies in BHK cells showed that the HCV 
IRES functions nearly as efficiently as the EMCV IRES for expression of HCV 
polypeptides. One hundred twenty-nine clones were screened using a polyclonal antiserum 
from a patient with chronic HCV (JHF; Grakoui et al., 1993c); 49 clones were analyzed 
for production of NS5B, the C-terminal protein in the HCV-H ORF [Grakoui et al., 1993a; 
Grakoui et al., 1993c]. Thirty-four clones passing these tests (expected restriction pattern; 
intact ORF and proper processing; NS5B production) were selected for in vitro 
transcription of potentially infectious RNA and infectivity analysis. 

Special conditions for transcription of full-length HCV RNA containing the internal poly 
(U/UC) tract and the 98-base element. For T7-driven transcription, in vitro transcription 
conditions were optimized and showed that the resulting RNAs contain the extreme 3' 
terminal sequence. This was of special concern since the T7 RNA polymerase termination 
signals (a secondary structure followed by poly-U) resemble the HCV sequences preceding 
the 3' novel element and we observed termination at this site. In addition, the enzyme 
seemed to be prone to premature termination inside the poly (U/UC) tract. As shown in 
Figure 8A, by raising the UTP concentration to 3 mM in the transcription reaction, high 
yields of full-length HCV RNA transcripts were obtained. T7 polymerase was clearly 
better in this regard than SP6 polymerase, which exhibited significant premature 
termination in the poly (U) tract even at relatively high concentrations of UTP. 

Chimpanzee experiment II 
Essentially as described above (Example 2), surgical procedures and direct intrahepatic 
inoculation were used to assay the infectivity of transcribed RNAs. Three animals, not 
previously used for HCV work and negative for HCV serology and RNA, were inoculated. 
Each of two of the animals was injected with RNA transcripts from 17 independent clones, 
with inoculations at 34 separate sites in the liver. Two separate inoculations were used for 
each transcript preparation: 50-100 μg RNA in PBS injected at one site and 1 μg RNA 
mixed with 10 μg lipofectin (a cationic liposome which enhances RNA transfection [see 
Rice et al., (1989) supra]) at a second site. This procedure was intended to maximize the 
chances of productive transfection for each clone/RNA preparation. As a negative control, 
a third animal (Chimp 1557) was similarly inoculated at 34 sites with transcripts (~1500 
μg) which contained a 21 residue in-frame deletion in NS5B encompassing the active site of 
the HCV RNA-dependent RNA polymerase (called ΔGDD). Following inoculation, serum 
samples were collected (at weekly intervals) and analyzed for HCV RNA, elevation of liver 
transaminases, and HCV-specific antibody. Neither the experimental animals nor the negative 
control animal (ΔGDD) exhibited signs of productive infection (circulating HCV RNA, 
elevated liver enzymes, histopathology). Of note for future experiments was the complete 
absence of detectable circulating HCV RNA even as early as one week after inoculation. 

EXAMPLE 4: Successful Recovery of Infectious HCV from cDNA 
Determination of the HCV-H consensus sequence. Since the limited pool screening 
approach was unsuccessful, we determined a complete consensus sequence for the HCV-H 
strain. Segments of these sequenced clones were used for directed assembly of full-length 
HCV-H clones having the consensus sequence. This procedure was expected to eliminate 
lethal mutations, which might have occurred during cDNA synthesis or PCR amplification, 
or which existed in the original HCV population. Accordingly, the consensus method had a 
strong chance of producing functional HCV. 



Table 4. Sequence information used to determine an HCV-H consensus sequence 

Designation      Description 
HCV-H CMR        CMR prototype HCV-H cDNA clone; infected chimp liver RNA (SEQ ID NO:19) 
HCV-H GenBank    HCV-H sequence 
AAK#83           Combinatorial library clone #83; H77 serum 
AAK#84           Combinatorial library clone #84; H77 serum 
AAK#86           Combinatorial library clone #86; H77 serum 
AAK#87           Combinatorial library clone #87; H77 serum 
AAK#89           Combinatorial library clone #89; H77 serum 
AAK#90           Combinatorial library clone #90; H77 serum 
AAK#92           Combinatorial library clone #92; H77 serum 
AAK#93           Combinatorial library clone #93; H77 serum 
AAK#96           Combinatorial library clone #96; H77 serum 
AAK#99           Combinatorial library clone #99; H77 serum 
AAK#101          Combinatorial library clone #101; H77 serum 
AAK#248          Combinatorial library clone #248; H77 serum 
AAK#227          Combinatorial library clone #227; H77 serum 
AAK#213          Combinatorial library clone #213; H77 serum 
AAK#211          Combinatorial library clone #211; H77 serum 
AAK#209          Combinatorial library clone #209; H77 serum 
AAK#12           Combinatorial library clone #12; H77 serum 



Complete sequences between the KpnI (580) and NotI (9219) sites in the HCV cDNA were 
determined for clones AAK#248, AAK#227, AAK#213, AAK#211, AAK#209, and 
AAK#12. Sequences for the prototype HCV-H CMR [Daemer et al., supra; Grakoui et 
al., (1993c) supra] and HCV-H GenBank [Inchauspe et al., (1991) supra] had been 
determined previously. These sequences are aligned in Figure 9. Dots indicate positions 
identical to the HCV-H CMR sequence, shown at the bottom (SEQ ID NOS:19 and 20); 
dashes indicate gaps; the sequence "PCR seq" was determined by direct sequencing of 
PCR-amplified HCV-H77 cDNA. Sequences of additional clones from our combinatorial 
library (AAK#83, #84, #86, #87, #89, #90, #92, #93, #95, #96, #99, #101) were 
determined for the HVR1 hypervariable region in E2 (most were sequenced between 
nucleotides 1464-1823; see below). Inspection of the alignment reveals an HCV H77 
consensus sequence (SEQ ID NO:1) at most positions. At some positions, however, no 
clear consensus sequence emerged. These variable positions were: 2170 (Gac versus Aac; 
variable base is indicated in upper case type), 3940 (gAg versus gGg), and 5560 (caA 
versus caT). In these cases, the sequence used in the consensus clone corresponded to the 
nucleotide yielding the amino acid found at that position for the majority of sequenced HCV 
isolates. 
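
The column-by-column logic of deriving a consensus from aligned clones can be illustrated with a minimal Python sketch. This is not the procedure actually used for Figure 9; it is a simple majority vote that assumes pre-aligned sequences of equal length, and it only flags positions without a clear majority (written as N) rather than resolving them by reference to other HCV isolates as described above. The function name, the min_fraction threshold and the toy sequences are all hypothetical.

```python
from collections import Counter


def consensus(aligned, min_fraction=0.5):
    """Majority-vote consensus of equal-length aligned sequences.

    Returns (consensus string, list of 1-based positions with no clear
    majority). Unclear positions are written as "N".
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    bases, unclear = [], []
    for pos in range(length):
        ranked = Counter(seq[pos] for seq in aligned).most_common()
        top_base, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        if tied or top_count / len(aligned) <= min_fraction:
            bases.append("N")
            unclear.append(pos + 1)
        else:
            bases.append(top_base)
    return "".join(bases), unclear


# Toy codons only (not HCV data): an even split at the first base, as in the
# "Gac versus Aac" situation, is reported as unclear.
toy_consensus, toy_unclear = consensus(["GAC", "GAC", "AAC", "AAC"])
print(toy_consensus, toy_unclear)
```

Positions returned in the second list are the ones that would need a separate decision rule, such as the comparison with other sequenced isolates described in the text.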

Regarding determination of a consensus sequence, additional areas of the HCV genome 
deserve further comment. First, the N-terminal portion of E2 is highly variable and 
believed to be the target of immune selection [Houghton, (1996) supra]. In the H77 
sample, considerable variability exists in HVR1 [see Nakajima et al., J. Virol. 70: 3325-9 
(1996); Ogata et al., (1991) supra]. Multiple independent clones from this region were 
sequenced and the predominant HVR1 sequence in each position was used in the consensus 
clones. The predominant sequence utilized differs in one position from that determined by 
others [Inchauspe et al., (1991) supra; Nakajima et al., (1996) supra; Ogata et al., (1991) 
supra]. However, it is highly similar to that of the prototype HCV-H clone, which was 
derived from liver RNA isolated from an H77-inoculated chimpanzee. Hence, it seemed 
that this sequence would be tolerated for HCV replication in chimps. As shown below, this 
sequence was functional but it is likely that many other HVR sequence variations will also 
be tolerated. 

For a second region of the HCV-H sequence, the 3' NTR poly (U/UC) tract, the length and 
composition were not determined unambiguously. Sufficient quantities of double-stranded 
cDNA could not be obtained for direct cloning of this region without resorting to PCR 
amplification. PCR amplification can contract and possibly expand the length of this 
homopolymer tract. Thus, clones resulting from this procedure may not reflect the native 
HCV genome RNA structure. In multiple independent clones derived by PCR 
amplification, the length of this tract varied from 41 to 133 nucleotides (see Kolykhalov et 
al., 1996 and Patent Application Serial No. 08/520,678). Hence, two different lengths of 
poly (U/UC) tract were tested: "short" (75 bases) or "long" (133 bases). The length of the 
"short" tract is actually about the medium length for all sequences (from different 
genotypes) reported by us [Kolykhalov et al., (1996) supra] or others [Tanaka et al., (1995) 
supra; Tanaka et al., (1996) supra; Yamada et al., (1996) supra]. The "long" tract was 
only recovered in one HCV-H clone (pGEM3Zf(-)HCV-H3'NTR#10); a tract of similar 
length was recovered in one clone of genotype 4 isolate WD [Kolykhalov et al., (1996) 
supra]. Such long poly (U/UC) tracts have not yet been reported by others [Tanaka et al., 
(1995) supra; Tanaka et al., (1996) supra; Yamada et al., (1996) supra]. 

Variations in 5' -terminal sequences, silent markers, length of 3' NTR poly (U/UC) tracts, 
and 3' run-off site. Given that additional bases were found at the 5' end of some HCV 
cDNA clones and the uncertainty about the length of the poly (U/UC) tract, several 
alternative clones were created. Silent nucleotide substitutions were incorporated in the 
ORF to serve as markers for identifying which derivatives were functional in later analyses 
and to demonstrate that replicating virus was in fact recovered from the assembled cDNA 
clones. Replacing the previously used HpaI site, a BsmI site was created following the 3' 
end of the HCV cDNA to allow for production of run-off transcripts corresponding to the 
precise 3' end of HCV genome RNA. Details describing these constructions follow: 

Additional bases at the 5' terminus. A recipient clone containing the most frequent 5' 
terminal sequence (5'-GCCA...-3'), called pTET/T7HCVABglII/5'+3'corr., was modified 
by subcloning a BssHII (479) to KpnI (580) fragment from pTET/HCV5'T7G3'AFL, one of 
the prototype HCV-H cDNA clones tested in chimpanzees, to create 
p67/HCVABglII/5'+3'/XhoI-. These clones differ by the presence of an XhoI site at position 
514 (pTET/T7HCVABglII/5'+3'corr.) or its absence (p67/HCVABglII/5'+3'/XhoI-). 
p67/HCVABglII/5'+3'/XhoI- was then used as the vector for construction of four 
derivatives with different 5' terminal sequences. These are: 



Plasmid                            5' sequence of T7 transcript    Marker (position) 
p70/HCVABglII/5'+3'/XhoI-/GG       5'-GGCCA...-3'                  XhoI- (514) 
p71/HCVABglII/5'+3'/XhoI-/GAG      5'-GAGCCA...-3'                 XhoI- (514) 
p72/HCVABglII/5'+3'/XhoI-/GUG      5'-GUGCCA...-3'                 XhoI- (514) 
p73/HCVABglII/5'+3'/XhoI-/GCG      5'-GCGCCA...-3'                 XhoI- (514) 



These derivatives were constructed using appropriate synthetic oligonucleotides and PCR 
amplification and their structures verified by sequence analysis. 

Assembly of a clone containing the consensus sequence between KpnI (580) and NotI 
(9219). A schematic of the assembly steps is shown in Figure 10. The 7 sequenced HCV- 
H clones were used to assemble a prototype consensus clone. The plasmid source, position 
in the HCV cDNA, and restriction sites used for assembly are summarized in Table 5. 



Table 5. Clones, fragments, and restriction sites used for consensus clone 
construction. 

Source of fragment 
(clone number)        Position in HCV genome    Restriction sites used 
313                   580-1046                  KpnI-XhoI 
248                   1046-1174                 XhoI-PpuMI 
12                    1174-1357                 PpuMI-BamHI 
209                   1357-1482                 BamHI-SalI 
227                   1482-1748                 SalI-PpuMI 
209                   1748-1908                 PpuMI-AscI 
227                   1908-2108                 AscI-BspEI 
312                   2108-2322                 BspEI-SstI 
CMR                   2322-2440                 SstI-ScaI 
213                   2440-2526                 ScaI-BssHII 
CMR                   2526-2828                 BssHII-HinfI 
211                   2828-2978                 HinfI-BsrGI 
209                   2978-3236                 BsrGI-BglII 
227                   3236-3478                 BglII-BglI 
209                   3478-3733                 BglI-SexAI 
12                    3733-3942                 SexAI-BfaI 
211                   3942-4069                 BfaI-SplI 
227                   4069-4545                 SplI-SstI 
248                   4545-4646                 SstI-SalI 
211                   4646-4976                 SalI-SmaI 
227                   4976-5610                 SmaI-XhoI 
209                   5610-5750                 XhoI-EaeI 
CMR                   5750-6209                 EaeI-Bsu36I 
213                   6209-6302                 Bsu36I-BlpI 
227                   6302-7529                 BlpI-BlpI-BamHI 
213                   7529-9219                 BamHI-NotI 
209                   7861-8205                 HindIII-EcoRI 



The final step in the assembly involved subcloning the KpnI-NotI consensus region into 
recipient vector pTET/T7HCVABglII/5'+3'corr to produce p61/HCVFLcons. 
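
Because each restriction fragment in Table 5 ends at the site where the next begins, the coordinates can be checked for gaps. The Python sketch below is illustrative only: it lists the first 26 rows of Table 5 (omitting the separate 7861-8205 HindIII-EcoRI row, which lies inside the 7529-9219 fragment) and tests whether they tile the KpnI (580) to NotI (9219) region without gaps or overlaps. The names TABLE5_FRAGMENTS and tiles_region are hypothetical.

```python
# (start, end) positions of the consecutive fragments listed in Table 5.
TABLE5_FRAGMENTS = [
    (580, 1046), (1046, 1174), (1174, 1357), (1357, 1482), (1482, 1748),
    (1748, 1908), (1908, 2108), (2108, 2322), (2322, 2440), (2440, 2526),
    (2526, 2828), (2828, 2978), (2978, 3236), (3236, 3478), (3478, 3733),
    (3733, 3942), (3942, 4069), (4069, 4545), (4545, 4646), (4646, 4976),
    (4976, 5610), (5610, 5750), (5750, 6209), (6209, 6302), (6302, 7529),
    (7529, 9219),
]


def tiles_region(fragments, start, end):
    """True if consecutive fragments share junction sites from start to end."""
    if not fragments:
        return False
    if fragments[0][0] != start or fragments[-1][1] != end:
        return False
    # Each fragment must end at the restriction site where the next begins.
    return all(a[1] == b[0] for a, b in zip(fragments, fragments[1:]))


print(tiles_region(TABLE5_FRAGMENTS, 580, 9219))
```

A check of this kind is a convenient way to catch transcription errors in the coordinate columns of the table.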

Introduction of a BsmI- substitution in the HCV cDNA and a BsmI run-off site. Since the 
previously used HpaI run-off site resulted in transcripts with an additional 3' terminal U 
residue which might be deleterious, clones were re-engineered so that transcripts 
terminating at the exact HCV 3' nucleotide could be synthesized. This was accomplished 
by positioning a BsmI site at an appropriate position downstream from the HCV 3' 
terminus. Cleavage with BsmI produces a template strand which terminates at the position 
corresponding to the HCV 3' terminus. Since the H77 consensus sequence contains a BsmI 
site at position 5934, this site was inactivated with a translationally silent substitution 
engineered by site-directed mutagenesis. 

The first step in this series of constructions was to inactivate the BsmI site in the HCV H77 
cDNA. This clone, called p62/HCVFLcons/Bsm(-), was created in a four fragment ligation 
which included: (1) annealed synthetic oligos between SacI (5923) and Sau3AI (5942) 
which contained a silent substitution inactivating the BsmI site (C instead of A at position 
5934); (2) NsiI (5282) to SacI (5923) fragment from p61/HCVFLcons; (3) Sau3AI (5942) 
to Bsu36I (6209) from p61/HCVFLcons; (4) Bsu36I (6209) and NsiI (5282) digested 
p61/HCVFLcons. p62/HCVFLcons/Bsm(-) was sequenced completely, verifying the 
structure of the assembled consensus clone, the presence of a silent marker mutation at 
position 899 (C instead of T), the ablated BsmI site, and a silent marker mutation at position 
8054 (see below). 

Intermediate plasmid p65/3'HCVBsm(+)/Not-Mlu, containing the 3' BsmI run-off site, was 
created by the following three fragment ligation: (1) annealed synthetic oligos between 
Sau3AI (9639) and MluI (9656) containing the BsmI site [5'-tgTcgcattc-3' (SEQ ID 
NO:21); the nucleotides in bold indicate the BsmI site, the upper case nucleotide 
corresponds to the 3' terminal base of the HCV genome]; (2) NotI (9219) to Sau3AI (9639) 
fragment from p62/HCVFLcons/Bsm(-); (3) MluI (9656) to NotI (9219) from 
p61/HCVFLcons. Note that this clone contains both the internal BsmI site (5934) and the 
engineered BsmI run-off site. 
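
The geometry of the BsmI run-off site can be worked through on the oligonucleotide sequence given above. The Python sketch below is illustrative and rests on an assumption not stated in this document: that BsmI recognizes GAATGC and cleaves the strand carrying GAATGC one base 3' of the site and the complementary strand one base inside the site, as commonly listed for this enzyme (this should be checked against the supplier's documentation). With the site placed in reverse orientation, the retained template strand then begins opposite the base two positions upstream of the GCATTC motif on the coding strand. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
BSMI_SITE = "GAATGC"  # assumed recognition sequence; assumed cut geometry (1/-1)


def revcomp(seq):
    """Reverse complement of a DNA sequence (returned in upper case)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def runoff_end(top_strand):
    """0-based index, on the coding (top) strand, of the last templated base.

    Assumes a single BsmI site in reverse orientation, so the top strand reads
    GCATTC and the template (bottom) strand carries GAATGC. BsmI is assumed to
    cut the GAATGC strand one base 3' of the site, so the retained template
    begins opposite the top-strand base two positions 5' of GCATTC.
    """
    top = top_strand.upper()
    site_start = top.find(revcomp(BSMI_SITE))
    if site_start < 2:
        raise ValueError("no reverse-oriented BsmI site with 2 upstream bases")
    return site_start - 2


oligo = "tgTcgcattc"  # sequence given in the text (SEQ ID NO:21)
last = runoff_end(oligo)
print(last, oligo[last])
```

Under the stated assumption, the last templated base is the upper-case T of the oligonucleotide, consistent with the statement that this nucleotide corresponds to the 3' terminal base of the HCV genome.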

The original consensus full-length clone, p61/HCVFLcons, contained a silent substitution in 
the NS5B coding region (A instead of G at position 8054). This substitution was used as a 
marker to distinguish between clones containing "short" poly (U/UC) tracts (these clones 
contain A at position 8054) or "long" poly (U/UC) tracts (with G at position 8054). 
p90/HCVFLlong pU (SEQ ID NO:5), containing long poly (U/UC) and G at position 8054, 
was constructed by ligation of four fragments: (1) XbaI (-20) to HindIII (7861) from 
p62/HCVFLcons/Bsm(-); (2) HindIII (7861) to EcoRI (8205) from library clone AAK#209 
(Figure 9) containing the G residue at position 8054; (3) EcoRI (8205) to NotI (9219) from 
p62/HCVFLcons/Bsm(-); (4) NotI (9219) to XbaI (-20) from p65/3'HCVBsm(+)/Not-Mlu. 



p91/HCVFLshort pU, a derivative containing the "short" poly (U/UC) tract and the silent
marker A at position 8054, was created by ligation of the following fragments: (1) BglI
(9398) to NheI (9520) from pGEM3Zf(-)HCV-H3'NTR#8; (2) NheI (9520) to MluI (9597)
from p65/3'HCVBsm(+)/Not-Mlu; (3) MluI (9597) to NotI (9219) from
p62/HCVFLcons/Bsm(-). Note that numbering for this construction refers to the final
p91/HCVFLshort pU sequence.

To generate the final set of full-length constructs with long poly (U/UC) and additional
nucleotides at the 5' terminus, the KpnI (580) to MluI (9656) fragment from
p90/HCVFLlong pU was cloned into p70/HCVΔBglII/5'+3'/XhoI-/GG,
p71/HCVΔBglII/5'+3'/XhoI-/GAG, p72/HCVΔBglII/5'+3'/XhoI-/GUG, and
p73/HCVΔBglII/5'+3'/XhoI-/GCG to create p92/HCVFLlong pU/5'GG, p93/HCVFLlong
pU/5'GAG, p94/HCVFLlong pU/5'GUG, and p95/HCVFLlong pU/5'GCG, respectively.

To generate the analogous set of full-length constructs with short poly (U/UC), the KpnI
(580) to MluI (9597) fragment from p91/HCVFLshort pU was cloned into
p70/HCVΔBglII/5'+3'/XhoI-/GG, p71/HCVΔBglII/5'+3'/XhoI-/GAG,
p72/HCVΔBglII/5'+3'/XhoI-/GUG, and p73/HCVΔBglII/5'+3'/XhoI-/GCG to create
p96/HCVFLshort pU/5'GG, p97/HCVFLshort pU/5'GAG, p98/HCVFLshort pU/5'GUG, and
p99/HCVFLshort pU/5'GCG, respectively.

The salient features of these 10 clones [5' bases, silent markers, poly (U/UC) length] are
summarized in Figure 11. Plasmids were propagated in E. coli (tet* SURE strain) and
purified plasmid DNAs were prepared by standard methods, including twice banding on
CsCl gradients [Ausubel et al., Current Protocols in Molecular Biology, Greene
Publishing Associates, New York (1993); Sambrook et al., Cold Spring Harbor Laboratory,
Cold Spring Harbor, NY (1989)].
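
For orientation, the ten constructs and their distinguishing features, as described in the
preceding paragraphs, can be tabulated programmatically. This is only an illustrative sketch:
the Clone record and its field names are hypothetical, and the assignment of the XhoI (514)
marker to the two single-G clones follows the description of the XhoI- recipient plasmids
p70-p73 given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    name: str
    five_prime: str   # 5' terminal bases preceding ...CCA in the transcript
    poly_u: str       # "long" or "short" poly (U/UC) tract
    nt_8054: str      # silent NS5B marker: G = long tract, A = short tract
    xhoI_514: bool    # True if the XhoI (514) silent marker is present


def build_clones() -> list:
    """Return the ten full-length clones p90-p99 described in the text."""
    clones = [
        Clone("p90/HCVFLlong pU", "G", "long", "G", True),
        Clone("p91/HCVFLshort pU", "G", "short", "A", True),
    ]
    extra_five_prime = ["GG", "GAG", "GUG", "GCG"]
    for offset, bases in enumerate(extra_five_prime):
        clones.append(
            Clone(f"p{92 + offset}/HCVFLlong pU/5'{bases}", bases, "long", "G", False)
        )
    for offset, bases in enumerate(extra_five_prime):
        clones.append(
            Clone(f"p{96 + offset}/HCVFLshort pU/5'{bases}", bases, "short", "A", False)
        )
    return clones


if __name__ == "__main__":
    for clone in build_clones():
        print(f"{clone.name:32s} 5'-{clone.five_prime:3s} {clone.poly_u:5s} "
              f"8054={clone.nt_8054} XhoI(514)={'+' if clone.xhoI_514 else '-'}")
```

Such a table makes it straightforward to derive the expected marker frequencies in an
equimolar inoculum, for example two of ten clones carrying XhoI (514) and five of ten
carrying G at position 8054.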



Transcription of full-length RNAs. As mentioned above, increasing the UTP concentration
to 3 mM in T7 transcription reactions increased the yield of full-length HCV RNAs by
facilitating readthrough of the poly (U/UC) tract. The skewed ratio of UTP (3 mM) to the
other rNTPs (1 mM) could lead to increased misincorporation of U residues, in particular
late in the transcription reaction when the other NTPs were substantially depleted. This
concern was avoided by increasing the concentration of the other three NTPs to 3 mM.
Purified plasmid DNAs were digested to completion with BsmI, extracted once with
phenol-chloroform and precipitated with ethanol [Ausubel et al., (1993) supra; Sambrook et al.,
(1989) supra]. DNA pellets were washed with EtOH to remove salts and resuspended in
RNase-free H2O. Transcription reactions (100 µl) contained the following components: 10
µg BsmI-linearized template DNA, 40 mM Tris-Cl, pH 7.8, 16 mM MgCl2, 5 mM DTT,
10 mM NaCl, 3 mM each rNTP, 100 units T7 RNA polymerase, and 0.02 U inorganic
pyrophosphatase. After a 1 hour incubation, typical yields were approximately 300
µg with greater than 80% full-length RNA as estimated by gel electrophoresis (Figure 8B).
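
The depletion concern described above can be illustrated with a rough nucleotide budget. The
sketch below is an approximation under stated assumptions and not a measurement: it assumes
an average mass of 320.5 g/mol per nucleotide residue in RNA (a common rule of thumb) and
equal usage of the four rNTPs, which the HCV genome does not strictly satisfy. The function
names are hypothetical.

```python
# Assumed rule-of-thumb average mass of one nucleotide residue in RNA (g/mol).
AVG_RESIDUE_MASS = 320.5


def ntp_supply_nmol(conc_mM: float, volume_ul: float) -> float:
    """nmol of one rNTP present in the reaction (1 mM x 1 ul = 1 nmol)."""
    return conc_mM * volume_ul


def ntp_demand_nmol(rna_ug: float, n_ntps: int = 4) -> float:
    """nmol of each rNTP incorporated, assuming equal use of all four bases."""
    total_nmol = rna_ug * 1000.0 / AVG_RESIDUE_MASS  # ug -> ng; ng/(g/mol) = nmol
    return total_nmol / n_ntps


if __name__ == "__main__":
    demand = ntp_demand_nmol(300)  # typical yield quoted in the text
    for conc in (1, 3):
        supply = ntp_supply_nmol(conc, 100)  # 100 ul reaction
        status = "limiting" if demand > supply else "sufficient"
        print(f"{conc} mM: supply {supply:.0f} nmol, demand {demand:.0f} nmol ({status})")
```

On these assumptions a 300 µg yield consumes roughly 230-240 nmol of each rNTP, more than
the 100 nmol available at 1 mM but less than the 300 nmol available at 3 mM, which is
consistent with the rationale for raising all four rNTPs to 3 mM.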

Chimpanzee experiment
Transcripts from the ten consensus clones were used to inoculate two different animals,
using essentially the same surgical procedures described above. Protocols were reviewed
and approved by the FDA and NIH Animal Studies Committees. Animals were
seronegative for all hepatitis viruses, negative for HCV RNA by nested RT-PCR, and had
normal baseline levels of liver enzymes. Two different inoculation/transfection protocols
were employed. For chimpanzee #1535, the 100 µl transcription reactions were diluted
with 400 µl PBS and stored frozen at -80°C until used for inoculation. These storage
conditions were tested and shown to have no observable effect on the integrity of HCV
RNA transcripts. Prior to inoculation, samples were thawed and each sample was injected
intrahepatically at two sites (~0.25 ml/site). Injection sites for the 10 clones were
distributed in three lobes of the liver. As a positive control for this procedure, chimpanzee
#1557 was inoculated similarly with RNA transcripts from two different hepatitis A virus
clones. In this case, 80-100 µg of transcribed RNA per clone was inoculated at two sites.
A third animal, chimpanzee #1536, was inoculated with smaller amounts of RNA which
had been mixed with lipofectin. In this case, the same transcript RNAs from the 10
full-length HCV-H77 clones were treated with DNaseI to remove template DNA and 0.15 µg,
0.5 µg, and 1.5 µg portions were diluted to 50 µl with PBS and stored at -80°C until used
for inoculation. After thawing, 100 µl PBS containing 9 µg lipofectin (Bethesda Research
Laboratories) was added to each sample, mixed, and injected into a single site. Hence, each
clone/transcript preparation with different RNA/lipofectin ratios was injected at three
separate sites.

Serum samples and liver biopsies were taken pre-inoculation and at weekly intervals
thereafter. For nearly two months post-inoculation, samples have been assayed for liver
enzymes (ALT, ICD, GGTP), hepatitis virus serology, and viremia by quantitative
competitive RT-PCR [Kolykhalov et al., (1996) supra].

Evidence for successful initiation of infection and replication. The results of our analyses
thus far are summarized in Table 6.



Table 6. Results of chimpanzee experiment III.

Chimp 1535 (RNA + DNA in PBS):

week    ALT    ICD     GGTP   anti-HCV ab   HCV RNA, bDNA (Meq/ml)   HCV RNA, QC RT-PCR
-5      43     453     28     0.2                                    -
-2-3    32     325     27     0.1                                    -
-1      36     600     27     0.2           -                        -
0       40     430     28     0.1           <0.2                     < 10^?/ml
1       42     490     24     0             0.445                    1 x 10^?/ml
2       96C    1000    53     0             0.283                    3 x 10^?/ml
3       81C    780     55     0             0.593                    6 x 10^?/ml
4       78     640     52     0.2           2.026                    1 x 10^?/ml
5       60     510     57     0.1           2.609                    2 x 10^?/ml
6       49     670     50     0.1           3.286                    T.B.D.
7       49     525     44     0             5.708                    T.B.D.
8       56     485     50     .01           T.B.D.                   T.B.D.
9       67     500     67     0.1           T.B.D.                   T.B.D.
10      98     725     79     0.2           T.B.D.                   T.B.D.
11      86     525     85     0.2           T.B.D.                   T.B.D.



Chimp 1536 (RNA + lipofectin):

week    ALT     ICD       GGTP     anti-HCV ab   HCV RNA, bDNA (Meq/ml)   HCV RNA, QC RT-PCR
-9      27      368       33       0.1
-5      45/45   524/496   82/77R   0.2
-2-3    28      375       52       0.1           -                        -
-1      34      475       41       0.1           -                        -
0       36      680       44       0.1           <0.2                     < 10^?/ml
1       45      660       42       0             <0.2                     1 x 10^?/ml
2       44      875       51       0             0.252                    3 x 10^?/ml
3       49      760       55       0             0.469                    1 x 10^?/ml
4       41      465       52       0.2           0.862                    2 x 10^?/ml
5       42      500       49       0.1           0.904                    3 x 10^?/ml
6       50      ?         60       0.00          1.489                    6 x 10^?/ml
7       43      490       55       0.1           3.413                    T.B.D.
8       53      700       64       0.1           13.00                    T.B.D.
9       38      505       65       0.1           3.271                    T.B.D.
10      133     1270      120      0.4           T.B.D.                   T.B.D.
11      324     1485      258      1.3           T.B.D.                   T.B.D.



Chimp 1557 (HAV RNA + DNA in PBS), positive control:

week    ALT    ICD     GGTP   anti-HAV
0       33     405     19     (-)
1       42     360     14     (-)
2       33     345     16     0.6
3       26     520     14     0.7
4       62     1330    24     3.5
5       43     700     28     21.4
6       23     650     27     27.9
7       22     540     25     14.6
8       20     490     22     T.B.D.

R = repeated
C = confirmed
T.B.D. = to be determined



Chimp #1535 showed a peak in liver enzymes at week 2 post-inoculation, which
gradually declined to the pre-inoculation baseline. At week 10, a second peak of liver
enzymes was observed. HCV RNA titers were below our detection limit pre-inoculation
(< 10^?/ml), increased to 10^?/ml by week 1, and continued to climb steadily, reaching
2 x 10^?/ml by week 5. This represents a 20-fold increase relative to week 1.

Chimp #1536 showed less evidence of early liver damage with only a minor peak in the
ICD level at week 2 and fluctuating values thereafter. However, highly elevated levels of
enzymes were observed in weeks 10 and 11. The animal also became HCV-seropositive on
weeks 10 and 11. On week 1, the HCV RNA titer was 10^?/ml and has climbed to 6 x
10^?/ml by week 6. This represents a 600-fold increase relative to week 1.

The positive control inoculated with HAV transcripts (chimpanzee #1557) showed a sharp 
peak in liver enzymes on week 4 and had clearly seroconverted by this time. HAV-specific 
immunoreactivity increased sharply on week 5 and continued at high levels thereafter. 
These results show clear evidence of HAV infection and validate the inoculation method 
used for chimpanzee #1535. 

All of the samples analyzed for HCV RNA were also assayed for the presence of residual
template DNA by omitting the enzyme in the reverse transcription step. No products were
obtained, demonstrating that the signals detected in the quantitative competitive PCR assay
were due to RNA (Figure 12). In addition, the HCV RNA-containing material in these
samples was resistant to RNase digestion under the same conditions that completely
degraded naked competitor RNA mixed with the serum being analyzed (Figure 13). These are
the expected results if the RNAs are packaged into enveloped RNase-resistant virus
particles, as opposed to residual inoculated RNA. Moreover, the total amount of transcript
RNA used for inoculation was ~3000 µg for chimpanzee #1535 and only ~22 µg for
chimpanzee #1536. In spite of being inoculated with ~150-fold less RNA, chimpanzee
#1536 showed higher levels of viremia than chimpanzee #1535. Thus the level of viremia
does not correlate with input RNA, which is again indicative of virus amplification and
spread. Finally, in the previous negative experiment using the non-consensus combinatorial
library clones and the ΔGDD negative control (Example 3), 1000-2000 µg of HCV-specific
RNA were inoculated per animal using similar procedures. No HCV RNA was detected at
week 1 or thereafter, again suggesting that the signal observed here is due to authentic virus
replication and release into the serum.
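
The input RNA totals quoted above can be reproduced from the inoculation protocols. The
following sketch is illustrative only; it assumes one 100 µl transcription reaction per clone
at the typical yield of approximately 300 µg for chimpanzee #1535, and the three
lipofectin portions (0.15, 0.5 and 1.5 µg) per clone for chimpanzee #1536. The variable and
function names are hypothetical.

```python
N_CLONES = 10

# Chimpanzee #1535: one undiluted-yield transcription reaction per clone (assumed ~300 ug).
YIELD_PER_REACTION_UG = 300.0

# Chimpanzee #1536: three RNA/lipofectin portions per clone, in ug.
LIPOFECTIN_PORTIONS_UG = (0.15, 0.5, 1.5)


def total_dose_ug(per_clone_ug: float, n_clones: int = N_CLONES) -> float:
    """Total transcript RNA inoculated, summed over all clones."""
    return per_clone_ug * n_clones


if __name__ == "__main__":
    dose_1535 = total_dose_ug(YIELD_PER_REACTION_UG)
    dose_1536 = total_dose_ug(sum(LIPOFECTIN_PORTIONS_UG))
    print(f"#1535: ~{dose_1535:.0f} ug")
    print(f"#1536: ~{dose_1536:.1f} ug")
    print(f"ratio: ~{dose_1535 / dose_1536:.0f}-fold")
```

With these inputs the totals come to about 3000 µg and 21.5 µg, a ratio of roughly 140-fold,
which agrees in order of magnitude with the approximate figures stated in the text.
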

Proof that the infections observed in these animals stemmed from the inoculated transcript
RNA was obtained by restriction enzyme and sequence analysis of recovered virus for the
presence of engineered markers. Two silent mutations marked all of the transfected RNAs.
These were the substitution at position 899 (C instead of T) and the substitution at position
5936 (C instead of A) ablating the internal BsmI site (5934). For the nucleotide 899
marker, the region from 466 to 950 was amplified by nested RT-PCR, sequenced
directly, and shown to have the expected H77 sequence including the silent C (instead of T)
marker at position 899. The region from 5801 to 6257 was also amplified by nested RT-PCR
and shown to be resistant to digestion with BsmI. The expected digestion products
were obtained, however, for four other enzymes cleaving in this region [SstI (5923); BspHI
(5944); Bsu36I (6209); RsaI (6244)] of the H77 cDNA sequence. These analyses were
conducted for both chimpanzee #1535 (week 5) and chimpanzee #1536 (week 6).

The pathogenesis profiles for the RNA-inoculated animals are reminiscent of those obtained
in previous experiments in which chimpanzees were inoculated with the H77 material or
other HCV-containing samples. The course of this disease in chimpanzees, as in man, is
highly variable with respect to the extent of liver damage, progression to chronicity, level
of viremia, and timing of seroconversion.

Identification of functional "infectious" clones by evaluating silent markers present in virus
recovered from infected animals. As detailed above, additional silent markers were
incorporated in order to help identify the 5' terminal sequence(s) and the length(s) of poly
(U/UC) tract which were required or preferred for initiating infection.

Transcripts containing a single G (5'-GCCA...-3') were distinguished from those with
additional 5' residues by the presence of the XhoI (514) silent marker in the C protein
coding region. The region containing this marker was amplified by RT-PCR under
conditions that ensured that a representative number of independent cDNAs were analyzed
(greater than 50 in this case). The resulting products were analyzed for digestion with
either XhoI or, as a control, AccI, an enzyme which should digest this fragment for all input
clones. For chimpanzee #1535 (week 3 sample), the fraction of the products digested with
XhoI paralleled the input inoculum: approximately 20% was digested with XhoI (both 4 U
and 30 U); 80% was resistant to digestion (values were determined by scanning ethidium
bromide-stained digestion patterns with an IC1000 Imaging System). Complete digestion
was observed for AccI. In the week 4 sample analyzed for chimpanzee #1536, 55% was
digested with XhoI; 45% was resistant to digestion. Again, complete digestion was
observed for AccI. Thus, in the second animal an advantage was observed for transcripts
with only a single G (5'-GCCA...-3'). Although it is not possible to draw firm quantitative
conclusions from these data regarding possible differences in specific infectivity, the results
clearly demonstrate that the transcripts without additional nucleotides are infectious (clones
p90/HCVFLlong pU and p91/HCVFLshort pU). Furthermore, transcripts with additional
nucleotides can also initiate infection, although our analysis thus far does not allow us to
distinguish among the various clones.
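
The comparison with the input inoculum can be made explicit. In the sketch below, which is
illustrative and not part of the original analysis, the expected XhoI-digestible fraction is
taken as 2 of 10 clones (p90 and p91), assuming that the ten transcripts were represented
equally in the inoculum; the observed fractions are those reported above and the function
names are hypothetical.

```python
def expected_fraction(n_marked: int, n_total: int) -> float:
    """Fraction of the inoculum carrying the marker, assuming equimolar clones."""
    return n_marked / n_total


def enrichment(observed_fraction: float, expected: float) -> float:
    """Ratio of observed to expected marker frequency (1.0 = no change)."""
    return observed_fraction / expected


# Observed XhoI-digested fractions reported in the text.
OBSERVED = {"chimpanzee #1535 (week 3)": 0.20, "chimpanzee #1536 (week 4)": 0.55}

if __name__ == "__main__":
    expected = expected_fraction(2, 10)  # p90 and p91 carry XhoI (514)
    for animal, fraction in OBSERVED.items():
        print(f"{animal}: {enrichment(fraction, expected):.2f}-fold vs. input")
```

A ratio near 1 for chimpanzee #1535 corresponds to no apparent selection, whereas the ratio of
about 2.75 for chimpanzee #1536 reflects the advantage of single-G transcripts noted above;
as the text cautions, these densitometric values do not support firm quantitative conclusions.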

Transcripts containing "short" or "long" poly (U/UC) tracts were distinguished by the silent
marker at position 8054 of the NS5B coding region. The region between 7955 and 8088
was amplified by RT-PCR, using enough cDNA to ensure the amplification of greater than
100 independent cDNA molecules, and molecularly cloned. Sequences of ten and nine
independent clones were determined for chimpanzee #1535 (week 3) and chimpanzee #1536
(week 4), respectively. Nine of ten clones (90%) for chimpanzee #1535 contained the G at
position 8054, indicative of the "long" poly (U/UC) tract. Six of nine clones (66%) for
chimpanzee #1536 contained the G at position 8054, indicative of the "long" poly (U/UC)
tract. The results demonstrate that transcripts containing either "short" or "long" poly
(U/UC) tracts are infectious but that the "long" poly (U/UC) tract appears to be preferred.
We cannot, however, rule out the possibility that this effect is due to deleterious effects of
the marker mutation at 8054. These additional analyses provide further confirmation that
the viremia observed in these animals was initiated by transcripts derived from our
full-length clones.
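
How strongly these clone counts favor the "long" tract can be gauged with a simple one-sided
binomial calculation. This is a hedged illustration added for clarity, not an analysis from
the original study: it assumes that "long" and "short" transcripts were equally represented
in the inoculum (five clones each) and that the sequenced clones are independent draws. The
function name is hypothetical.

```python
from math import comb


def upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more 'long' clones."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Counts of clones with G at position 8054, as reported in the text.
    for animal, k, n in (("#1535", 9, 10), ("#1536", 6, 9)):
        print(f"chimpanzee {animal}: {k}/{n} long, P(>= {k}) = {upper_tail(k, n):.3f}")
```

Under these assumptions nine of ten "long" clones would arise by chance about 1% of the
time, whereas six of nine would arise about 25% of the time, so the apparent preference is
supported mainly by chimpanzee #1535 and remains tentative given the small sample sizes.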

The functional genotype 1a cDNA clones described in this Example, or functional clones
for other HCV genotypes (constructed and verified using similar methods), have a variety
of applications for development of (i) more effective HCV therapies; (ii) HCV vaccines;
(iii) HCV diagnostics; and (iv) HCV-based gene expression vectors.



EXAMPLE 5: Productive HCV Infection of a Hepatocyte Line
The EcoRI-BstBI fragment from pCEN was cloned into the unique SfiI site of
p90/HCVFLlong pU. Prior to ligation, protruding termini were blunt ended using
T4 DNA polymerase in the presence of dNTPs. The EcoRI-BstBI fragment from pCEN
contains the EMCV IRES element followed by the neomycin-resistance (NEO) coding
region. This IRES NEO cassette is essentially identical to that described in Ghattas et
al. [Mol. Cell. Biol. 11:5848 (1991)]. A clone containing this cassette in the correct
orientation (positive-sense with respect to HCV genome RNA) was identified by
digestion with appropriate restriction enzymes.

The EMCV IRES NEO cassette was inserted into the SfiI site in the 3' NTR of p90/HCVFLlong
pU. The transcribed RNA was used to transfect a human hepatocyte cell line, which was then
selected for neomycin resistance using G418. Most cells died, but a G418-resistant population
grew out over the course of a few months. Remarkably, HCV RNA appears to be still present in
these cells at a copy number of ~1000 RNA molecules per cell. It is believed that the neomycin
resistance is mediated by HCV RNA because there is no evidence for integration of
contaminating template DNA in the genome of these cells.

The present invention is not to be limited in scope by the specific embodiments described 
herein. Indeed, various modifications of the invention in addition to those described herein will
become apparent to those skilled in the art from the foregoing description and the 
accompanying figures. Such modifications are intended to fall within the scope of the 
appended claims. 

It is further to be understood that all base sizes or amino acid sizes, and all molecular weight or 
molecular mass values, given for nucleic acids or polypeptides are approximate, and are 
provided for description. 

Various publications are cited herein, the disclosures of which are incorporated by reference in
their entireties.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Rice, Charles et al.

(ii) TITLE OF INVENTION: FUNCTIONAL DNA CLONE FOR HEPATITIS C 
VIRUS (HCV) AND USES THEREOF 

(iii) NUMBER OF SEQUENCES: 21 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: David A. Jackson, Esq. 

(B) STREET: 411 Hackensack Ave, Continental Plaza, 4th Floor

(C) CITY: Hackensack 

(D) STATE: New Jersey 

(E) COUNTRY: USA 

(F) ZIP: 07601 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER : US - 

(B) FILING DATE: 03-MAR-1997 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Jackson Esq., David A.

(B) REGISTRATION NUMBER: 26,742

(C) REFERENCE/DOCKET NUMBER: 1113-1-006

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 201-487-5800 

(B) TELEFAX: 201-343-1684 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9646 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GCCAGCCCCC TGATGGGGGC GACACTCCAC CATGAATCAC TCCCCTGTGA GGAACTACTG 60

TCTTCACGCA GAAAGCGTCT AGCCATGGCG TTAGTATGAG TGTCGTGCAG CCTCCAGGAC 120

CCCCCCTCCC GGGAGAGCCA TAGTGGTCTG CGGAACCGGT GAGTACACCG GAATTGCCAG 180

GACGACCGGG TCCTTTCTTG GATAAACCCG CTCAATGCCT GGAGATTTGG GCGTGCCCCC 240

GCAAGACTGC TAGCCGAGTA GTGTTGGGTC GCGAAAGGCC TTGTGGTACT GCCTGATAGG 300 

GTGCTTGCGA GTGCCCCGGG AGGTCTCGTA GACCGTGCAC CATGAGCACG AATCCTAAAC 360 

CTCAAAGAAA AACCAAACGT AACACCAACC GTCGCCCACA GGACGTCAAG TTCCCGGGTG 420 

GCGGTCAGAT CGTTGGTGGA GTTTACTTGT TGCCGCGCAG GGGCCCTAGA TTGGGTGTGC 480 

GCGCGACGAG GAAGACTTCC GAGCGGTCGC AACCTCGAGG TAGACGTCAG CCTATCCCCA 540 

AGGCACGTCG GCCCGAGGGC AGGACCTGGG CTCAGCCCGG GTACCCTTGG CCCCTCTATG 600 

GCAATGAGGG TTGCGGGTGG GCGGGATGGC TCCTGTCTCC CCGTGGCTCT CGGCCTAGCT 660 

GGGGCCCCAC AGACCCCCGG CGTAGGTCGC GCAATTTGGG TAAGGTCATC GATACCCTTA 720

CGTGCGGCTT CGCCGACCTC ATGGGGTACA TACCGCTCGT CGGCGCCCCT CTTGGAGGCG 780

CTGCCAGGGC CCTGGCGCAT GGCGTCCGGG TTCTGGAAGA CGGCGTGAAC TATGCAACAG 840 

GGAACCTTCC TGGTTGCTCT TTCTCTATCT TCCTTCTGGC CCTGCTCTCT TGCCTGACTG 900 

TGCCCGCTTC AGCCTACCAA GTGCGCAATT CCTCGGGGCT TTACCATGTC ACCAATGATT 960 

GCCCTAACTC GAGTATTGTG TACGAGGCGG CCGATGCCAT CCTGCACACT CCGGGGTGTG 1020 

TCCCTTGCGT TCGCGAGGGT AACGCCTCGA GGTGTTGGGT GGCGGTGACC CCCACGGTGG 1080 

CCACCAGGGA CGGCAAACTC CCCACAACGC AGCTTCGACG TCATATCGAT CTGCTTGTCG 1140 

GGAGCGCCAC CCTCTGCTCG GCCCTCTACG TGGGGGACCT GTGCGGGTCT GTCTTTCTTG 1200

TTGGTCAACT GTTTACCTTC TCTCCCAGGC GCCACTGGAC GACGCAAGAC TGCAATTGTT 1260

CTATCTATCC CGGCCATATA ACGGGTCATC GCATGGCATG GGATATGATG ATGAACTGGT 1320 

CCCCTACGGC AGCGTTGGTG GTAGCTCAGC TGCTCCGGAT CCCACAAGCC ATCATGGACA 1380 

TGATCGCTGG TGCTCACTGG GGAGTCCTGG CGGGCATAGC GTATTTCTCC ATGGTGGGGA 1440

ACTGGGCGAA GGTCCTGGTA GTGCTGCTGC TATTTGCCGG CGTCGACGCG GAAACCCACG 1500 

TCACCGGGGG AAGTGCCGGC CGCACCACNG CTGGGCTTGT TGGTCTCCTT ACACCAGGCG 1560

CCAAGCAGAA CATCCAACTG ATCAACACCA ACGGCAGTTG GCACATCAAT AGCACGGCCT 1620 

TGAACTGCAA TGAAAGCCTT AACACCGGCT GGTTAGCAGG GCTCTTCTAT CAGCACAAAT 1680 

TCAACTCTTC AGGCTGTCCT GAGAGGTTGG CCAGCTGCCG ACGCCTTACC GATTTTGCCC 1740 

AGGGCTGGGG TCCTATCAGT TATGCCAACG GAAGCGGCCT CGACGAACGC CCCTACTGCT 1800 

GGCACTACCC TCCAAGACCT TGTGGCATTG TGCCCGCAAA GAGCGTGTGT GGCCCGGTAT 1860 

ATTGCTTCAC TCCCAGCCCC GTGGTGGTGG GAACGACCGA CAGGTCGGGC GCGCCTACCT 1920 

ACAGCTGGGG TGCAAATGAT ACGGATGTCT TCGTCCTTAA CAACACCAGG CCACCGCTNG 1980

GCAATTGGTT CGGTTGTACC TGGATGAACT CAACTGGATT CACCAANGTG TGCGGAGCGC 2040

CCCCTTGTGT CATCGGAGGG GTGGGCAACA ACACCTTGCT CTGCCCCACT GATTGTTTCC 2100

GCAAGCATCC GGAAGCCACA TACTCTCGGT GCGGCTCCGG TCCCTGGATT ACACCCAGGT 2160

GCATGGTCGA CTACCCGTAT AGGCTTTGGC ACTATCCTTG TACCATCAAT TACACCATAT 2220 

TCAAAGTCAG GATGTACGTG GGAGGGGTCG AGCACAGGCT GGAAGCGGCC TGCAACTGGA 2280 

CGCGGGGCGA ACGCTGTGAT CTGGAAGACA GGGACAGGTC CGAGCTCAGC CCATTGCTGC 2340 

TGTCCACCAC ACAGTGGCAG GTCCTTCCGT GTTCTTTCAC GACCCTGCCA GCCTTGTCCA 2400 

CCGGCCTCAT CCACCTCCAC CAGAACATTG TGGACGTGCA GTACTTGTAC GGGGTAGGGT 2460

CAAGCATCGC GTCCTGGGCC ATTAAGTGGG AGTACGTCGT TCTCCTGTTC CTTCTGCTTG 2520

CAGACGCNCG CGTCTGCTCC TGCTTGTGGA TGATGTTACT CATATCCCAA GCGGAGGCGG 2580

CTTTGGAGAA CCTCGTAATA CTCAATGCAG CATCCCTGGC CGGGACGCAC GGTCTTGTGT 2640 

CCTTCCTCGT GTTCTTCTGC TTTGCGTGGT ATCTGAAGGG TAGGTGGGTG CCCGGAGCGG 2700 

TCTACGCCTT CTACGGGATG TGGCCTCTCC TCCTGCTCCT GCTGGCGTTG CCTCAGCGGG 2760

CATACGCACT GGACACGGAG GTGGCCGCGT CGTGTGGCGG CGTTGTTCTT GTCGGGTTAA 2820

TGGCGCTGAC TCTGTCGCCA TATTACAAGC GCTACATCAG CTGGTGCATG TGGTGGCTTC 2880

AGTATTTTCT GACCAGAGTA GAAGCGCAAC TGCACGTGTG GGTTCCCCCC CTCAACGTCC 2940

GGGGGGGGCG CGATGCCGTC ATCTTACTCA TGTGTGTTGT ACACCCGACT CTGGTATTTG 3000

ACATCACCAA ACTACTCCTG GCCATCTTCG GACCCCTTTG GATTCTTCAA GCCAGTTTGC 3060

TTAAAGTCCC CTACTTCGTG CGCGTTCAAG GCCTTCTCCG GATCTGCGCG CTAGCGCGGA 3120

AGATAGCCGG AGGTCATTAC GTGCAAATGG CCATCATCAA GTTAGGGGCG CTTACTGGCA 3180

CCTATGTGTA TAACCATCTC ACCCCTCTTC GAGACTGGGC GCACAACGGC CTGCGAGATC 3240

TGGCCGTGGC TGTGGAACCA GTCGTCTTCT CCCGAATGGA GACCAAGCTC ATCACGTGGG 3300

GGGCAGATAC CGCCGCGTGC GGTGACATCA TCAACGGCTT GCCCGTCTCT GCCCGTAGGG 3360

GCCAGGAGAT ACTGCTTGGG CCAGCCGACG GAATGGTCTC CAAGGGGTGG AGGTTGCTGG 3420

CGCCCATCAC GGCGTACGCC CAGCAGACGA GAGGCCTCCT AGGGTGTATA ATCACCAGCC 3480

TGACTGGCCG GGACAAAAAC CAAGTGGAGG GTGAGGTCCA GATCGTGTCA ACTGCTACCC 3540

AAACCTTCCT GGCAACGTGC ATCAATGGGG TATGCTGGAC TGTCTACCAC GGGGCCGGAA 3600

CGAGGACCAT CGCATCACCC AAGGGTCCTG TCATCCAGAT GTATACCAAT GTGGACCAAG 3660

ACCTTGTGGG CTGGCCCGCT CCTCAAGGTT CCCGCTCATT GACACCCTGC ACCTGCGGCT 3720

CCTCGGACCT TTACCTGGTC ACGAGGCACG CCGATGTCAT TCCCGTGCGC CGGCGAGGTG 3780

ATAGCAGGGG TAGCCTGCTT TCGCCCCGGC CCATTTCCTA CTTGAAAGGC TCCTCGGGGG 3840

GTCCGCTGTT GTGCCCCGCG GGACACGCCG TGGGCCTATT CAGGGCCGCG GTGTGCACCC 3900

GTGGAGTGGC TAAGGCGGTG GACTTTATCC CTGTGGAGAA CCTAGAGACA ACCATGAGAT 3960

CCCCGGTGTT CACGGACAAC TCCTCTCCAC CAGCAGTGCC CCAGAGCTTC CAGGTGGCCC 4020

ACCTGCATGC TCCCACCGGC AGCGGTAAGA GCACCAAGGT CCCGGCTGCG TACGCAGCCC 4080

AGGGCTACAA GGTGTTGGTG CTCAACCCCT CTGTTGCTGC AACGCTGGGC TTTGGTGCTT 4140

ACATGTCCAA GGCCCATGGG GTTGATCCTA ATATCAGGAC CGGGGTGAGA ACAATTACCA 4200

CTGGCAGCCC CATCACGTAC TCCACCTACG GCAAGTTCCT TGCCGACGGC GGGTGCTCAG 4260


GAGGTGCTTA TGACATAATA ATTTGTGACG AGTGCCACTC CACGGATGCC ACATCCATCT 4320

TGGGCATCGG CACTGTCCTT GACCAAGCAG AGACTGCGGG GGCGAGACTG GTTGTGCTCG 4380

CCACTGCTAC CCCTCCGGGC TCCGTCACTG TGTCCCATCC TAACATCGAG GAGGTTGCTC 4440

TGTCCACCAC CGGAGAGATC CCTTTTTACG GCAAGGCTAT CCCCCTCGAG GTGATCAAGG 4500

GGGGAAGACA TCTCATCTTC TGCCACTCAA AGAAGAAGTG CGACGAGCTC GCCGCGAAGC 4560

TGGTCGCATT GGGCATCAAT GCCGTGGCCT ACTACCGCGG TCTTGACGTG TCTGTCATCC 4620

CGACCAGCGG CGATGTTGTC GTCGTGTCGA CCGATGCTCT CATGACTGGC TTTACCGGCG 4680

ACTTCGACTC TGTGATAGAC TGCAACACGT GTGTCACTCA GACAGTCGAT TTCAGCCTTG 4740

ACCCTACCTT TACCATTGAG ACAACCACGC TCCCCCAGGA TGCTGTCTCC AGGACTCAAC 4800

GCCGGGGCAG GACTGGCAGG GGGAAGCCAG GCATCTACAG ATTTGTGGCA CCGGGGGAGC 4860

GCCCCTCCGG CATGTTCGAC TCGTCCGTCC TCTGTGAGTG CTATGACGCG GGCTGTGCTT 4920

GGTATGAGCT CACGCCCGCC GAGACTACAG TTAGGCTACG AGCGTACATG AACACCCCGG 4980

GGCTTCCCGT GTGCCAGGAC CATCTTGAAT TTTGGGAGGG CGTCTTTACG GGCCTCACTC 5040

ATATAGATGC CCACTTTCTA TCCCAGACAA AGCAGAGTGG GGAGAACTTT CCTTACCTGG 5100

TAGCGTACCA AGCCACCGTG TGCGCTAGGG CTCAAGCCCC TCCCCCATCG TGGGACCAGA 5160

TGTGGAAGTG TTTGATCCGC CTTAAACCCA CCCTCCATGG GCCAACACCC CTGCTATACA 5220

GACTGGGCGC TGTTCAGAAT GAAGTCACCC TGACGCACCC AATCACCAAA TACATCATGA 5280

CATGCATGTC GGCCGACCTG GAGGTCGTCA CGAGCACCTG GGTGCTCGTT GGCGGCGTCC 5340

TGGCTGCTCT GGCCGCGTAT TGCCTGTCAA CAGGCTGCGT GGTCATAGTG GGCAGGATTG 5400

TCTTGTCCGG GAAGCCGGCA ATTATACCTG ACAGGGAGGT TCTCTACCAG GAGTTCGATG 5460

AGATGGAAGA GTGCTCTCAG CACTTACCGT ACATCGAGCA AGGGATGATG CTCGCTGAGC 5520

AGTTCAAGCA GAAGGCCCTC GGCCTCCTGC AGACCGCGTC CCGCCAAGCA GAGGTTATCA 5580

CCCCTGCTGT CCAGACCAAC TGGCAGAAAC TCGAGGTCTT CTGGGCGAAG CACATGTGGA 5640

ATTTCATCAG TGGGATACAA TACTTGGCGG GCCTGTCAAC GCTGCCTGGT AACCCCGCCA 5700

TTGCTTCATT GATGGCTTTT ACAGCTGCCG TCACCAGCCC ACTAACCACT GGCCAAACCC 5760

TCCTCTTCAA CATATTGGGG GGGTGGGTGG CTGCCCAGCT CGCCGCCCCC GGTGCCGCTA 5820

CCGCCTTTGT GGGCGCTGGC TTAGCTGGCG CCGCCATCGG CAGCGTTGGA CTGGGGAAGG 5880

TCCTCGTGGA CATTCTTGCA GGGTATGGCG CGGGCGTGGC GGGAGCTCTT GTAGCATTCA 5940

AGATCATGAG CGGTGAGGTC CCCTCCACGG AGGACCTGGT CAATCTGCTG CCCGCCATCC 6000

TCTCGCCTGG AGCCCTTGTA GTCGGTGTGG TCTGCGCAGC AATACTGCGC CGGCACGTTG 6060

GCCCGGGCGA GGGGGCAGTG CAATGGATGA ACCGGCTAAT AGCCTTCGCC TCCCGGGGGA 6120 

ACCATGTTTC CCCCACGCAC TACGTGCCGG AGAGCGATGC AGCCGCCCGC GTCACTGCCA 6180 

TACTCAGCAG CCTCACTGTA ACCCAGCTCC TGAGGCGACT GCATCAGTGG ATAAGCTCGG 6240

AGTGTACCAC TCCATGCTCC GGTTCCTGGC TAAGGGACAT CTGGGACTGG ATATGCGAGG 6300 

TGCTGAGCGA CTTTAAGACC TGGCTGAAAG CCAAGCTCAT GCCACAACTG CCTGGGATTC 6360 

CCTTTGTGTC CTGCCAGCGC GGGTATAGGG GGGTCTGGCG AGGAGACGGC ATTATGCACA 6420 

CTCGCTGCCA CTGTGGAGCT GAGATCACTG GACATGTCAA AAACGGGACG ATGAGGATCG 6480 

TCGGTCCTAG GACCTGCAGG AACATGTGGA GTGGGACGTT CCCCATTAAC GCCTACACCA 6540 

CGGGCCCCTG TACTCCCCTT CCTGCGCCGA ACTATAAGTT CGCGCTGTGG AGGGTGTCTG 6600 

CAGAGGAATA CGTGGAGATA AGGCGGGTGG GGGACTTCCA CTACGTATCG GGTATGACTA 6660

CTGACAATCT TAAATGCCCG TGCCAGATCC CATCGCCCGA ATTTTTCACA GAATTGGACG 6720

GGGTGCGCCT ACATAGGTTT GCGCCCCCTT GCAAGCCCTT GCTGCGGGAG GAGGTATCAT 6780

TCAGAGTAGG ACTCCACGAG TACCCGGTGG GGTCGCAATT ACCTTGCGAG CCCGAACCGG 6840

ACGTAGCCGT GTTGACGTCC ATGCTCACTG ATCCCTCCCA TATAACAGCA GAGGCGGCCG 6900

GGAGAAGGTT GGCGAGAGGG TCACCCCCTT CTATGGCCAG CTCCTCGGCC AGCCAGCTGT 6960 

CCGCTCCATC TCTCAAGGCA ACTTGCACCG CCAACCATGA CTCCCCTGAC GCCGAGCTCA 7020 

TAGAGGCTAA CCTCCTGTGG AGGCAGGAGA TGGGCGGCAA CATCACCAGG GTTGAGTCAG 7080

AGAACAAAGT GGTGATTCTG GACTCCTTCG ATCCGCTTGT GGCAGAGGAG GATGAGCGGG 7140

AGGTCTCCGT ACCCGCAGAA ATTCTGCGGA AGTCTCGGAG ATTCGCCCGG GCCCTGCCCG 7200

TTTGGGCGCG GCCGGACTAC AACCCCCCGC TAGTAGAGAC GTGGAAAAAG CCTGACTACG 7260 

AACCACCTGT GGTCCATGGC TGCCCGCTAC CACCTCCACG GTCCCCTCCT GTGCCTCCGC 7320 

CTCGGAAAAA GCGTACGGTG GTCCTCACCG AATCAACCCT ATCTACTGCC TTGGCCGAGC 7380

TTGCCACCAA AAGTTTTGGC AGCTCCTCAA CTTCCGGCAT TACGGGCGAC AATACGACAA 7440

CATCCTCTGA GCCCGCCCCT TCTGGCTGCC CCCCCGACTC CGACGTTGAG TCCTATTCTT 7500

CCATGCCCCC CCTGGAGGGG GAGCCTGGGG ATCCGGATCT CAGCGACGGG TCATGGTCGA 7560

CGGTCAGTAG TGGGGCCGAC ACGGAAGATG TCGTGTGCTG CTCAATGTCT TATTCCTGGA 7620

CAGGCGCACT CGTCACCCCG TGCGCTGCGG AAGAACAAAA ACTGCCCATC AACGCACTGA 7680

GCAACTCGTT GCTACGCCAT CACAATCTGG TGTATTCCAC CACTTCACGC AGTGCTTGCC 7740

AAAGGCAGAA GAAAGTCACA TTTGACAGAC TGCAAGTTCT GGACAGCCAT TACCAGGACG 7800

TGCTCAAGGA GGTCAANGCA GCGGCGTCAA AAGTGAAGGC TAACTTGCTA TCCGTAGAGG 7860

AAGCTTGCAG CCTGACGCCC CCACATTCAG CCAAATCCAA GTTTGGCTAT GGGGCAAAAG 7920

ACGTCCGTTG CCATGCCAGA AAGGCCGTAG CCCACATCAA CTCCGTGTGG AAAGACCTTC 7980

TGGAAGACAG TGTAACACCA ATAGACACTA CCATCATGGC CAAGAACGAG GTTTTCTGCG 8040

TTCAGCCTGA GAAGGGGGGT CGTAAGCCAG CTCGTCTCAT CGTGTTCCCC GACCTGGGCG 8100

TGCGCGTGTG CGAGAAGATG GCCCTGTACG ACGTGGTTAG CAAGCTCCCC CTGGCCGTGA 8160

TGGGAAGCTC CTACGGATTC CAATACTCAC CAGGACAGCG GGTTGAATTC CTCGTGCAAG 8220

CGTGGAAGTC CAAGAAGACC CCGATGGGGT TCTCGTATGA TACCCGCTGT TTTGACTCCA 8280

CAGTCACTGA GAGCGACATC CGTACGGAGG AGGCAATTTA CCAATGTTGT GACCTGGACC 8340

CCCAAGCCCG CGTGGCCATC AAGTCCCTCA CTGAGAGGCT TTATGTTGGG GGCCCTCTTA 8400

CCAATTCAAG GGGGGANAAC TGCGGCTACC GCAGGTGCCG CGCGAGCGGC GTACTGACAA 8460

CTAGCTGTGG TAACACCCTC ACTTGCTACA TCAAGGCCCG GGCAGCCTGT CGAGCCGCAG 8520

GGCTCCAGGA CTGCACCATG CTCGTGTGTG GCGACGACTT AGTCGTTATC TGTGAAAGTG 8580

CGGGGGTCCA GGAGGACGCG GCGAGCCTGA GAGCCTTCAC GGAGGCTATG ACCAGGTACT 8640

CCGCCCCCCC CGGGGACCCC CCACAACCAG AATACGACTT GGAGCTTATA ACATCATGCT 8700

CCTCCAACGT GTCAGTCGCC CACGACGGCG CTGGAAAGAG GGTCTACTAC CTTACCCGTG 8760

ACCCTACAAC CCCCCTCGCG AGAGCCGCGT GGGAGACAGC AAGACACACT CCAGTCAATT 8820

CCTGGCTAGG CAACATAATC ATGTTTGCCC CCACACTGTG GGCGAGGATG ATACTGATGA 8880

CCCATTTCTT TAGCGTCCTC ATAGCCAGGG ATCAGCTTGA ACAGGCTCTT AACTGTGAGA 8940

TCTACGGAGC CTGCTACTCC ATAGAACCAC TGGATCTACC TCCAATCATT CAAAGACTCC 9000

ATGGCCTCAG CGCATTTTCA CTCCACAGTT ACTCTCCAGG TGAAATCAAT AGGGTGGCCG 9060

CATGCCTCAG AAAACTTGGG GTCCCGCCCT TGCGAGCTTG GAGACACCGG GCCCGGAGCG 9120 

TCCGCGCTAG GCTTCTGTCC AGAGGAGGCA GGGCTGCCAT ATGTGGCAAG TACCTCTTCA 9180 

ACTGGGCAGT AAGAACAAAG CTCAAACTCA CTCCAATAGC GGCCGCTGGC CGGCTGGACT 9240 

TGTCCGGTTG GTTCACGGCT GGCTACAGCG GGGGAGACAT TTATCACAGC GTGTCTCATG 9300

CCCGGCCCCG CTGGTTCTGG TTTTGCCTAC TCCTGCTCGC TGCAGGGGTA GGCATCTACC 9360 

TCCTCCCCAA CCGATGAAGG TTGGGGTAAA CACTCCGGCC TCTTAGGCCA TTTCCTGTTT 9420

TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTCT TTTTTTTTTT 9480

TTTTTTCCTT TTTTTTTTTT TTTTTTTTCT TTCCTTCTTT TTTCCTTTCT TTTCCTTCCT 9540 

TCTTTAATGG TGGCTCCATC TTAGCCCTAG TCACGGCTAG CTGTGAAAGG TCCGTGAGCC 9600 

GCATGACTGC AGAGAGTGCT GATACTGGCC TCTCTGCAGA TCATGT 9646 

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3012 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO

(v) FRAGMENT TYPE: N-terminal



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
130 135 140

Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr
180 185 190

Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp Cys Pro
195 200 205

Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu His Thr Pro
210 215 220

Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser Arg Cys Trp Val
225 230 235 240

Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly Lys Leu Pro Thr Thr
245 250 255

Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly Ser Ala Thr Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Gly
275 280 285

Gln Leu Phe Thr Phe Ser Pro Arg Arg His Trp Thr Thr Gln Asp Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala Trp
305 310 315 320



Asp Met Met Met Asn Trp Ser Pro Thr Ala Ala Leu Val Val Ala Gln
325 330 335

Leu Leu Arg Ile Pro Gln Ala Ile Met Asp Met Ile Ala Gly Ala His
340 345 350

Trp Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp
355 360 365

Ala Lys Val Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu
370 375 380

Thr His Val Thr Gly Gly Ser Ala Gly Arg Thr Thr Ala Gly Leu Val
385 390 395 400

Gly Leu Leu Thr Pro Gly Ala Lys Gln Asn Ile Gln Leu Ile Asn Thr
405 410 415

Asn Gly Ser Trp His Ile Asn Ser Thr Ala Leu Asn Cys Asn Glu Ser
420 425 430

Leu Asn Thr Gly Trp Leu Ala Gly Leu Phe Tyr Gln His Lys Phe Asn
435 440 445

Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Arg Leu Thr Asp
450 455 460

Phe Ala Gln Gly Trp Gly Pro Ile Ser Tyr Ala Asn Gly Ser Gly Leu
465 470 475 480

Asp Glu Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro Cys Gly Ile
485 490 495

Val Pro Ala Lys Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser
500 505 510

Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser
515 520 525

Trp Gly Ala Asn Asp Thr Asp Val Phe Val Leu Asn Asn Thr Arg Pro
530 535 540

Pro Leu Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe
545 550 555 560

Thr Lys Val Cys Gly Ala Pro Pro Cys Val Ile Gly Gly Val Gly Asn
565 570 575

Asn Thr Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala
580 585 590

Thr Tyr Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro Arg Cys Met
595 600 605

Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Ile Asn Tyr
610 615 620

Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Leu
625 630 635 640

Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp
645 650 655

Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Gln Trp
660 665 670

Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr Gly
675 680 685

Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly
690 695 700

Val Gly Ser Ser Ile Ala Ser Trp Ala Ile Lys Trp Glu Tyr Val Val
705 710 715 720

Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys Ser Cys Leu Trp
725 730 735

Met Met Leu Leu Ile Ser Gln Ala Glu Ala Ala Leu Glu Asn Leu Val
740 745 750

Ile Leu Asn Ala Ala Ser Leu Ala Gly Thr His Gly Leu Val Ser Phe
755 760 765

Leu Val Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly Arg Trp Val Pro
770 775 780

Gly Ala Val Tyr Ala Phe Tyr Gly Met Trp Pro Leu Leu Leu Leu Leu
785 790 795 800

Leu Ala Leu Pro Gln Arg Ala Tyr Ala Leu Asp Thr Glu Val Ala Ala
805 810 815

Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala Leu Thr Leu Ser
820 825 830

Pro Tyr Tyr Lys Arg Tyr Ile Ser Trp Cys Met Trp Trp Leu Gln Tyr
835 840 845

Phe Leu Thr Arg Val Glu Ala Gln Leu His Val Trp Val Pro Pro Leu
850 855 860

Asn Val Arg Gly Gly Arg Asp Ala Val Ile Leu Leu Met Cys Val Val
865 870 875 880

His Pro Thr Leu Val Phe Asp Ile Thr Lys Leu Leu Leu Ala Ile Phe
885 890 895

Gly Pro Leu Trp Ile Leu Gln Ala Ser Leu Leu Lys Val Pro Tyr Phe
900 905 910

Val Arg Val Gln Gly Leu Leu Arg Ile Cys Ala Leu Ala Arg Lys Ile
915 920 925

Ala Gly Gly His Tyr Val Gln Met Ala Ile Ile Lys Leu Gly Ala Leu
930 935 940

Thr Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala
945 950 955 960

His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe
965 970 975

Ser Arg Met Glu Thr Lys Leu Ile Thr Trp Gly Ala Asp Thr Ala Ala
980 985 990

Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala Arg Arg Gly Gln
995 1000 1005

Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser Lys Gly Trp Arg
1010 1015 1020

Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu
1025 1030 1035 1040

Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
1045 1050 1055

Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr
1060 1065 1070

Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg
1075 1080 1085

Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val
1090 1095 1100

Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg Ser Leu
1105 1110 1115 1120

Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His
1125 1130 1135

Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu
1140 1145 1150

Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro
1155 1160 1165

Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe Arg Ala Ala Val
1170 1175 1180

Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Asn
1185 1190 1195 1200

Leu Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro
1205 1210 1215

Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr
1220 1225 1230

Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly
1235 1240 1245

Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe
1250 1255 1260

Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg Thr
1265 1270 1275 1280

Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr
1285 1290 1295

Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile
1300 1305 1310

Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly
1315 1320 1325

Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val
1330 1335 1340



Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Ser His Pro
1345 1350 1355 1360

Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr
1365 1370 1375

Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile
1380 1385 1390

Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val
1395 1400 1405

Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser
1410 1415 1420

Val Ile Pro Thr Ser Gly Asp Val Val Val Val Ser Thr Asp Ala Leu
1425 1430 1435 1440

Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr
1445 1450 1455

Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile
1460 1465 1470

Glu Thr Thr Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg
1475 1480 1485

Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro
1490 1495 1500

Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys
1505 1510 1515 1520

Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr
1525 1530 1535

Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln
1540 1545 1550

Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile
1555 1560 1565

Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Phe Pro
1570 1575 1580

Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro
1585 1590 1595 1600

Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro
1605 1610 1615

Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln
1620 1625 1630

Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Thr Cys
1635 1640 1645

Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly
1650 1655 1660

Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val
1665 1670 1675 1680

Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys Pro Ala Ile Ile Pro
1685 1690 1695

Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ser
1700 1705 1710

Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe
1715 1720 1725

Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu
1730 1735 1740

Val Ile Thr Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Val Phe
1745 1750 1755 1760

Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala
1765 1770 1775

Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala
1780 1785 1790

Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly Gln Thr Leu Leu
1795 1800 1805

Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly
1810 1815 1820

Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly
1825 1830 1835 1840

Ser Val Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly
1845 1850 1855

Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu
1860 1865 1870

Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser
1875 1880 1885

Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg
1890 1895 1900

His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile
1905 1910 1915 1920

Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro
1925 1930 1935

Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr
1940 1945 1950

Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys
1955 1960 1965

Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile
1970 1975 1980

Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met
1985 1990 1995 2000

Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Arg
2005 2010 2015

Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly
2020 2025 2030

Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly
2035 2040 2045

Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala
2050 2055 2060

Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Lys Phe
2065 2070 2075 2080

Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Arg Val
2085 2090 2095

Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp Asn Leu Lys Cys
2100 2105 2110

Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val
2115 2120 2125

Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu
2130 2135 2140

Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu
2145 2150 2155 2160

Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr
2165 2170 2175

Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg
2180 2185 2190

Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala
2195 2200 2205

Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala
2210 2215 2220

Glu Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn
2225 2230 2235 2240

Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe
2245 2250 2255

Asp Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala
2260 2265 2270

Glu Ile Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala Leu Pro Val Trp
2275 2280 2285

Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro 
2290 2295 2300 

Asp Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Arg 
2305 2310 2315 2320 

Ser Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr 
2325 2330 2335 

Glu Ser Thr Leu Ser Thr Ala Leu Ala Glu Leu Ala Thr Lys Ser Phe 
2340 2345 2350 

Gly Ser Ser Ser Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser
2355 2360 2365 

Ser Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Val Glu Ser 
2370 2375 2380 

Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu 
2385 2390 2395 2400 

Ser Asp Gly Ser Trp Ser Thr Val Ser Ser Gly Ala Asp Thr Glu Asp 
2405 2410 2415 

Val Val Cys Cys Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr 
2420 2425 2430 

Pro Cys Ala Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn
2435 2440 2445

Ser Leu Leu Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser
2450 2455 2460

Ala Cys Gln Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu
2465 2470 2475 2480

Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser
2485 2490 2495

Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr
2500 2505 2510

Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val
2515 2520 2525

Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn Ser Val Trp Lys
2530 2535 2540

Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp Thr Thr Ile Met Ala
2545 2550 2555 2560

Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro
2565 2570 2575

Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys
2580 2585 2590

Met Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu Ala Val Met Gly
2595 2600 2605

Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu
2610 2615 2620

Val Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Ser Tyr Asp
2625 2630 2635 2640

Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu
2645 2650 2655

Glu Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala
2660 2665 2670

Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn
2675 2680 2685

Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val
2690 2695 2700

Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg
2705 2710 2715 2720

Ala Ala Cys Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys
2725 2730 2735

Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp
2740 2745 2750

Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala
2755 2760 2765

Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr
2770 2775 2780

Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg
2785 2790 2795 2800

Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala
2805 2810 2815

Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile
2820 2825 2830

Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His
2835 2840 2845

Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asn
2850 2855 2860

Cys Glu Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro
2865 2870 2875 2880

Pro Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser
2885 2890 2895

Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala Ala Cys Leu Arg Lys Leu
2900 2905 2910

Gly Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg
2915 2920 2925

Ala Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr
2930 2935 2940

Leu Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala
2945 2950 2955 2960

Ala Ala Gly Arg Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser
2965 2970 2975

Gly Gly Asp Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Phe
2980 2985 2990

Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu
2995 3000 3005 

Pro Asn Arg Glx 
3010 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
GCCAGCCCCC TGATGGGGGC GACACTCCAC CATGAATC 
(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 101 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
AATGGTGGCT CCATCTTAGC CCTAGTCACG GCTAGCTGTG AAAGGTCCGT GAGCCGCATG 60 
ACTGCAGAGA GTGCTGATAC TGGCCTCTCT GCTGATCATG T 101 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12980 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 



(iv) ANTI-SENSE: NO

GCCAGCCCCC TGATGGGGGC GACACTCCAC CATGAATCAC TCCCCTGTGA GGAACTACTG 60

TCTTCACGCA GAAAGCGTCT AGCCATGGCG TTAGTATGAG TGTCGTGCAG CCTCCAGGAC 120

CCCCCCTCCC GGGAGAGCCA TAGTGGTCTG CGGAACCGGT GAGTACACCG GAATTGCCAG 180

GACGACCGGG TCCTTTCTTG GATAAACCCG CTCAATGCCT GGAGATTTGG GCGTGCCCCC 240

GCAAGACTGC TAGCCGAGTA GTGTTGGGTC GCGAAAGGCC TTGTGGTACT GCCTGATAGG 300

GTGCTTGCGA GTGCCCCGGG AGGTCTCGTA GACCGTGCAC CATGAGCACG AATCCTAAAC 360

CTCAAAGAAA AACCAAACGT AACACCAACC GTCGCCCACA GGACGTCAAG TTCCCGGGTG 420

GCGGTCAGAT CGTTGGTGGA GTTTACTTGT TGCCGCGCAG GGGCCCTAGA TTGGGTGTGC 480

GCGCGACGAG GAAGACTTCC GAGCGGTCGC AACCTCGAGG TAGACGTCAG CCTATCCCCA 540

AGGCACGTCG GCCCGAGGGC AGGACCTGGG CTCAGCCCGG GTACCCTTGG CCCCTCTATG 600

GCAATGAGGG TTGCGGGTGG GCGGGATGGC TCCTGTCTCC CCGTGGCTCT CGGCCTAGCT 660

GGGGCCCCAC AGACCCCCGG CGTAGGTCGC GCAATTTGGG TAAGGTCATC GATACCCTTA 720

CGTGCGGCTT CGCCGACCTC ATGGGGTACA TACCGCTCGT CGGCGCCCCT CTTGGAGGCG 780

CTGCCAGGGC CCTGGCGCAT GGCGTCCGGG TTCTGGAAGA CGGCGTGAAC TATGCAACAG 840

GGAACCTTCC TGGTTGCTCT TTCTCTATCT TCCTTCTGGC CCTGCTCTCT TGCCTGACCG 900

TGCCCGCTTC AGCCTACCAA GTGCGCAATT CCTCGGGGCT TTACCATGTC ACCAATGATT 960

GCCCTAACTC GAGTATTGTG TACGAGGCGG CCGATGCCAT CCTGCACACT CCGGGGTGTG 1020

TCCCTTGCGT TCGCGAGGGT AACGCCTCGA GGTGTTGGGT GGCGGTGACC CCCACGGTGG 1080

CCACCAGGGA CGGCAAACTC CCCACAACGC AGCTTCGACG TCATATCGAT CTGCTTGTCG 1140

GGAGCGCCAC CCTCTGCTCG GCCCTCTACG TGGGGGACCT GTGCGGGTCT GTCTTTCTTG 1200

TTGGTCAACT GTTTACCTTC TCTCCCAGGC GCCACTGGAC GACGCAAGAC TGCAATTGTT 1260

CTATCTATCC CGGCCATATA ACGGGTCATC GCATGGCATG GGATATGATG ATGAACTGGT 1320

CCCCTACGGC AGCGTTGGTG GTAGCTCAGC TGCTCCGGAT CCCACAAGCC ATCATGGACA 1380

TGATCGCTGG TGCTCACTGG GGAGTCCTGG CGGGCATAGC GTATTTCTCC ATGGTGGGGA 1440

ACTGGGCGAA GGTCCTGGTA GTGCTGCTGC TATTTGCCGG CGTCGACGCG GAAACCCACG 1500

TCACCGGGGG AAGTGCCGGC CGCACCACGG CTGGGCTTGT TGGTCTCCTT ACACCAGGCG 1560

CCAAGCAGAA CATCCAACTG ATCAACACCA ACGGCAGTTG GCACATCAAT AGCACGGCCT 1620

TGAACTGCAA TGAAAGCCTT AACACCGGCT GGTTAGCAGG GCTCTTCTAT CAGCACAAAT 1680

TCAACTCTTC AGGCTGTCCT GAGAGGTTGG CCAGCTGCCG ACGCCTTACC GATTTTGCCC 1740

AGGGCTGGGG TCCTATCAGT TATGCCAACG GAAGCGGCCT CGACGAACGC CCCTACTGCT 1800

GGCACTACCC TCCAAGACCT TGTGGCATTG TGCCCGCAAA GAGCGTGTGT GGCCCGGTAT 1860

ATTGCTTCAC TCCCAGCCCC GTGGTGGTGG GAACGACCGA CAGGTCGGGC GCGCCTACCT 1920

ACAGCTGGGG TGCAAATGAT ACGGATGTCT TCGTCCTTAA CAACACCAGG CCACCGCTGG 1980

GCAATTGGTT CGGTTGTACC TGGATGAACT CAACTGGATT CACCAAAGTG TGCGGAGCGC 2040

CCCCTTGTGT CATCGGAGGG GTGGGCAACA ACACCTTGCT CTGCCCCACT GATTGTTTCC 2100

GCAAGCATCC GGAAGCCACA TACTCTCGGT GCGGCTCCGG TCCCTGGATT ACACCCAGGT 2160

GCATGGTCGA CTACCCGTAT AGGCTTTGGC ACTATCCTTG TACCATCAAT TACACCATAT 2220

TCAAAGTCAG GATGTACGTG GGAGGGGTCG AGCACAGGCT GGAAGCGGCC TGCAACTGGA 2280

CGCGGGGCGA ACGCTGTGAT CTGGAAGACA GGGACAGGTC CGAGCTCAGC CCATTGCTGC 2340

TGTCCACCAC ACAGTGGCAG GTCCTTCCGT GTTCTTTCAC GACCCTGCCA GCCTTGTCCA 2400

CCGGCCTCAT CCACCTCCAC CAGAACATTG TGGACGTGCA GTACTTGTAC GGGGTAGGGT 2460

CAAGCATCGC GTCCTGGGCC ATTAAGTGGG AGTACGTCGT TCTCCTGTTC CTCCTGCTTG 2520

CAGACGCGCG CGTCTGCTCC TGCTTGTGGA TGATGTTACT CATATCCCAA GCGGAGGCGG 2580

CTTTGGAGAA CCTCGTAATA CTCAATGCAG CATCCCTGGC CGGGACGCAC GGTCTTGTGT 2640

CCTTCCTCGT GTTCTTCTGC TTTGCGTGGT ATCTGAAGGG TAGGTGGGTG CCCGGAGCGG 2700

TCTACGCCTT CTACGGGATG TGGCCTCTCC TCCTGCTCCT GCTGGCGTTG CCTCAGCGGG 2760

CATACGCACT GGACACGGAG GTGGCCGCGT CGTGTGGCGG CGTTGTTCTT GTCGGGTTAA 2820

TGGCGCTGAC TCTGTCGCCT TATTACAAGC GCTACATCAG CTGGTGCATG TGGTGGCTTC 2880

AGTATTTTCT GACCAGAGTA GAAGCGCAAC TGCACGTGTG GGTTCCCCCC CTCAACGTCC 2940

GGGGGGGGCG CGATGCCGTC ATCTTACTCA TGTGTGTTGT ACACCCGACT CTGGTATTTG 3000




ACATCACCAA ACTACTCCTG GCCATCTTCG GACCCCTTTG GATTCTTCAA GCCAGTTTGC 3060

TTAAAGTCCC CTACTTCGTG CGCGTTCAAG GCCTTCTCCG GATCTGCGCG CTAGCGCGGA 3120

AGATAGCCGG AGGTCATTAC GTGCAAATGG CCATCATCAA GTTAGGGGCG CTTACTGGCA 3180

CCTATGTGTA TAACCATCTC ACCCCTCTTC GAGACTGGGC GCACAACGGC CTGCGAGATC 3240

TGGCCGTGGC TGTGGAACCA GTCGTCTTCT CCCGTATGGA GACCAAGCTC ATCACGTGGG 3300

GGGCAGATAC CGCCGCGTGC GGTGACATCA TCAACGGCTT GCCCGTCTCT GCCCGTAGGG 3360

GCCAGGAGAT ACTGCTTGGG CCAGCCGACG GAATGGTCTC CAAGGGGTGG AGGTTGCTGG 3420

CGCCCATCAC GGCGTACGCC CAGCAGACGA GAGGCCTCCT AGGGTGTATA ATCACCAGCC 3480

TGACTGGCCG GGACAAAAAC CAAGTGGAGG GTGAGGTCCA GATCGTGTCA ACTGCTACCC 3540

AAACCTTCCT GGCTACGTGC ATCAATGGGG TATGCTGGAC TGTCTACCAC GGGGCCGGAA 3600

CGAGGACCAT CGCATCACCC AAGGGTCCTG TCATCCAGAT GTATACCAAT GTGGACCAAG 3660

ACCTTGTGGG CTGGCCCGCT CCTCAAGGTT CCCGCTCATT GACACCCTGC ACCTGCGGCT 3720

CCTCGGACCT TTACCTGGTC ACGAGGCACG CCGATGTCAT TCCCGTGCGC CGGCGAGGTG 3780

ATAGCAGGGG TAGCCTGCTT TCGCCCCGGC CCATTTCCTA CTTGAAAGGC TCCTCGGGGG 3840

GTCCGCTGTT GTGCCCOGCG GGACACGCCG TGGGCCTATT CAGGGCCGCG GTGTGCACCC 3900

GTGGAGTGGC TAAGGCGGTG GACTTTATCC CTGTGGAGAA CCTAGAGACA ACCATGAGAT 3960

CCCCGGTGTT CACGGACAAC TCCTCTCCAC CAGCAGTGCC CCAGAGCTTC CAGGTGGCCC 4020

ACCTGCATGC TCCCACCGGC AGCGGTAAGA GCACCAAGGT CCCGGCTGCG TACGCAGCCC 4080

AGGGCTACAA GGTGTTGGTG CTCAACCCCT CTGTTGCTGC AACGCTGGGC TTTGGTGCTT 4140

ACATGTCCAA GGCCCATGGG GTTGATCCTA ATATCAGGAC CGGGGTGAGA ACAATTACCA 4200

CTGGCAGCCC CATCACGTAC TCCACCTACG GCAAGTTCCT TGCCGACGGC GGGTGCTCAG 4260

GAGGTGCTTA TGACATTATA ATTTGTGACG AGTGCCACTC CACGGATGCC ACATCCATCT 4320

TGGGCATCGG CACTGTCCTT GACCAAGCAG AGACTGCGGG GGCGAGACTG GTTGTGCTCG 4380

CCACTGCTAC CCCTCCGGGC TCCGTCACTG TGTCCCATCC TAACATCGAG GAGGTTGCTC 4440

TGTCCACCAC CGGAGAGATC CCCTTTTACG GCAAGGCTAT CCCCCTCGAG GTGATCAAGG 4500

GGGGAAGACA TCTCATCTTC TGCCACTCAA AGAAGAAGTG CGACGAGCTC GCCGCGAAGC 4560

TGGTCGCATT GGGCATCAAT GCCGTGGCCT ACTACCGCGG TCTTGACGTG TCTGTCATCC 4620

CGACCAGCGG CGATGTTGTC GTCGTGTCGA CCGATGCTCT CATGACTGGC TTTACCGGCG 4680

ACTTCGACTC TGTGATAGAC TGCAACACGT GTGTCACTCA GACAGTCGAT TTCAGCCTTG 4740

ACCCTACCTT TACCATTGAG ACAACCACGC TCCCCCAGGA TGCTGTCTCC AGGACTCAAC 4800

GCCGGGGCAG GACTGGCAGG GGGAAGCCAG GCATCTACAG ATTTGTGGCA CCGGGGGAGC 4860

GCCCCTCCGG CATGTTCGAC TCGTCCGTCC TCTGTGAGTG CTATGACGCG GGCTGTGCTT 4920

GGTATGAGCT CACGCCCGCC GAGACTACAG TTAGGCTACG AGCGTACATG AACACCCCGG 4980

GGCTTCCCGT GTGCCAGGAC CATCTTGAAT TTTGGGAGGG CGTCTTTACG GGCCTCACTC 5040

ATATAGATGC CCACTTTCTA TCCCAGACAA AGCAGAGTGG GGAGAACTTT CCTTACCTGG 5100

TAGCGTACCA AGCCACCGTG TGCGCTAGGG CTCAAGCCCC TCCCCCATCG TGGGACCAGA 5160

TGTGGAAGTG TTTGATCCGC CTTAAACCCA CCCTCCATGG GCCAACACCC CTGCTATACA 5220

GACTGGGCGC TGTTCAGAAT GAAGTCACCC TGACGCACCC AATCACCAAA TACATCATGA 5280

CATGCATGTC GGCCGACCTG GAGGTCGTCA CGAGCACCTG GGTGCTCGTT GGCGGCGTCC 5340

TGGCTGCTCT GGCCGCGTAT TGCCTGTCAA CAGGCTGCGT GGTCATAGTG GGCAGGATTG 5400

TCTTGTCCGG GAAGCCGGCA ATTATACCTG ACAGGGAGGT TCTCTACCAG GAGTTCGATG 5460

AGATGGAAGA GTGCTCTCAG CACTTACCGT ACATCGAGCA AGGGATGATG CTCGCTGAGC 5520

AGTTCAAGCA GAAGGCCCTC GGCCTCCTGC AGACCGCGTC CCGCCAAGCA GAGGTTATCA 5580

CCCCTGCTGT CCAGACCAAC TGGCAGAAAC TCGAGGTCTT CTGGGCGAAG CACATGTGGA 5640

ATTTCATCAG TGGGATACAA TACTTGGCGG GCCTGTCAAC GCTGCCTGGT AACCCCGCCA 5700

TTGCTTCATT GATGGCTTTT ACAGCTGCCG TCACCAGCCC ACTAACCACT GGCCAAACCC 5760

TCCTCTTCAA CATATTGGGG GGGTGGGTGG CTGCCCAGCT CGCCGCCCCC GGTGCCGCTA 5820

CCGCCTTTGT GGGCGCTGGC TTAGCTGGCG CCGCCATCGG CAGCGTTGGA CTGGGGAAGG 5880

TCCTCGTGGA CATTCTTGCA GGGTATGGCG CGGGCGTGGC GGGAGCTCTT GTAGCCTTCA 5940

AGATCATGAG CGGTGAGGTC CCCTCCACGG AGGACCTGGT CAATCTGCTG CCCGCCATCC 6000

TCTCGCCTGG AGCCCTTGTA GTCGGTGTGG TCTGCGCAGC AATACTGCGC CGGCACGTTG 6060

GCCCGGGCGA GGGGGCAGTG CAATGGATGA ACCGGCTAAT AGCCTTCGCC TCCCGGGGGA 6120

ACCATGTTTC CCCCACGCAC TACGTGCCGG AGAGCGATGC AGCCGCCCGC GTCACTGCCA 6180

TACTCAGCAG CCTCACTGTA ACCCAGCTCC TGAGGCGACT GCATCAGTGG ATAAGCTCGG 6240

AGTGTACCAC TCCATGCTCC GGTTCCTGGC TAAGGGACAT CTGGGACTGG ATATGCGAGG 6300

TGCTGAGCGA CTTTAAGACC TGGCTGAAAG CCAAGCTCAT GCCACAACTG CCTGGGATTC 6360

CCTTTGTGTC CTGCCAGCGC GGGTATAGGG GGGTCTGGCG AGGAGACGGC ATTATGCACA 6420

CTCGCTGCCA CTGTGGAGCT GAGATCACTG GACATGTCAA AAACGGGACG ATGAGGATCG 6480

TCGGTCCTAG GACCTGCAGG AACATGTGGA GTGGGACGTT CCCCATTAAC GCCTACACCA 6540

CGGGCCCCTG TACTCCCCTT [illegible] AGGGTGTCTG 6600

CAGAGGAATA CGTGGAGATA [illegible] GGTATGACTA 6660

CTGACAATCT TAAATGCCCG TGCCAGATCC CATCGCCCGA ATTTTTCACA GAATTGGACG 6720

GGGTGCGCCT ACATAGGTTT GCGCCCCCTT GCAAGCCCTT GCTGCGGGAG GAGGTATCAT 6780

TCAGAGTAGG ACTCCACGAG TACCCGGTGG GGTCGCAATT ACCTTGCGAG CCCGAACCGG 6840

ACGTAGCCGT GTTGACGTCC ATGCTCACTG ATCCCTCCCA TATAACAGCA GAGGCGGCCG 6900

GGAGAAGGTT GGCGAGAGGG TCACCCCCTT CTATGGCCAG CTCCTCGGCC AGCCAGCTGT 6960

CCGCTCCATC TCTCAAGGCA ACTTGCACCG CCAACCATGA CTCCCCTGAC GCCGAGCTCA 7020

TAGAGGCTAA CCTCCTGTGG AGGCAGGAGA TGGGCGGCAA CATCACCAGG GTTGAGTCAG 7080

AGAACAAAGT GGTGATTCTG GACTCCTTCG ATCCGCTTGT GGCAGAGGAG GATGAGCGGG 7140

AGGTCTCCGT ACCCGCAGAA ATTCTGCGGA AGTCTCGGAG ATTCGCCCGG GCCCTGCCCG 7200

TTTGGGCGCG GCCGGACTAC AACCCCCCGC TAGTAGAGAC GTGGAAAAAG CCTGACTACG 7260

AACCACCTGT GGTCCATGGC TGCCCGCTAC CACCTCCACG GTCCCCTCCT GTGCCTCCGC 7320

CTCGGAAAAA GCGTACGGTG GTCCTCACCG AATCAACCCT ATCTACTGCC TTGGCCGAGC 7380

TTGCCACCAA AAGTTTTGGC AGCTCCTCAA CTTCCGGCAT TACGGGCGAC AATACGACAA 7440

CATCCTCTGA GCCCGCCCCT TCTGGCTGCC CCCCCGACTC CGACGTTGAG TCCTATTCTT 7500

CCATGCCCCC CCTGGAGGGG GAGCCTGGGG ATCCGGATCT CAGCGACGGG TCATGGTCGA 7560

CGGTGAGTAG TGGGGCCGAC ACGGAAGATG TCGTGTGCTG CTCAATGTCT TATTCCTGGA 7620

CAGGCGCACT CGTCACCCCG TGCGCTGCGG AAGAACAAAA ACTGCCCATC AACGCACTGA 7680

GCAACTCGTT GCTACGCCAT CACAATCTGG TGTATTCCAC CACTTCACGC AGTGCTTGCC 7740

AAAGGCAGAA GAAAGTCACA TTTGACAGAC TGCAAGTTCT GGACAGCCAT TACCAGGACG 7800

TGCTCAAGGA GGTCAAAGCA GCGGCGTCAA AAGTGAAGGC TAACTTGCTA TCCGTAGAGG 7860

AAGCTTGCAG CCTGACGCCC CCACATTCAG CCAAATCCAA GTTTGGCTAT GGGGCAAAAG 7920

ACGTCCGTTG CCATGCCAGA AAGGCCGTAG CCCACATCAA CTCCGTGTGG AAAGACCTTC 7980

TGGAAGACAG TGTAACACCA ATAGACACTA CCATCATGGC CAAGAACGAG GTTTTCTGCG 8040

TTCAGCCTGA GAAGGGGGGT CGTAAGCCAG CTCGTCTCAT CGTGTTCCCC GACCTGGGCG 8100

TGCGCGTGTG CGAGAAGATG GCCCTGTACG ACGTGGTTAG CAAGCTCCCC CTGGCCGTGA 8160

TGGGAAGCTC CTACGGATTC CAATACTCAC CAGGACAGCG GGTTGAATTC CTCGTGCAAG 8220

CGTGGAAGTC CAAGAAGACC CCGATGGGGT TCTCGTATGA TACCCGCTGT TTTGACTCCA 8280

CAGTCACTGA GAGCGACATC CGTACGGAGG AGGCAATTTA CCAATGTTGT GACCTGGACC 8340

CCCAAGCCCG CGTGGCCATC AAGTCCCTCA CTGAGAGGCT TTATGTTGGG GGCCCTCTTA 8400

CCAATTCAAG GGGGGAAAAC TGCGGCTACC GCAGGTGCCG CGCGAGCGGC GTACTGACAA 8460

CTAGCTGTGG TAACACCCTC ACTTGCTACA TCAAGGCCCG GGCAGCCTGT CGAGCCGCAG 8520

GGCTCCAGGA CTGCACCATG CTCGTGTGTG GCGACGACTT AGTCGTTATC TGTGAAAGTG 8580

CGGGGGTCCA GGAGGACGCG GCGAGCCTGA GAGCCTTCAC GGAGGCTATG ACCAGGTACT 8640

CCGCCCCCCC CGGGGACCCC CCACAACCAG AATACGACTT GGAGCTTATA ACATCATGCT 8700

CCTCCAACGT GTCAGTCGCC CACGACGGCG CTGGAAAGAG GGTCTACTAC CTTACCCGTG 8760

ACCCTACAAC CCCCCTCGCG AGAGCCGCGT GGGAGACAGC AAGACACACT CCAGTCAATT 8820

CCTGGCTAGG CAACATAATC ATGTTTGCCC CCACACTGTG GGCGAGGATG ATACTGATGA 8880

CCCATTTCTT TAGCGTCCTC ATAGCCAGGG ATCAGCTTGA ACAGGCTCTT AACTGTGAGA 8940

TCTACGGAGC CTGCTACTCC ATAGAACCAC TGGATCTACC TCCAATCATT CAAAGACTCC 9000

ATGGCCTCAG CGCATTTTCA CTCCACAGTT ACTCTCCAGG TGAAATCAAT AGGGTGGCCG 9060

CATGCCTCAG AAAACTTGGG GTCCCGCCCT TGCGAGCTTG GAGACACCGG GCCCGGAGCG 9120

TCCGCGCTAG GCTTCTGTCC AGAGGAGGCA GGGCTGCCAT ATGTGGCAAG TACCTCTTCA 9180

ACTGGGCAGT AAGAACAAAG CTCAAACTCA CTCCAATAGC GGCCGCTGGC CGGCTGGACT 9240

TGTCCGGTTG GTTCACGGCT GGCTACAGCG GGGGAGACAT TTATCACAGC GTGTCTCATG 9300

CCCGGCCCCG CTGGTTCTGG TTTTGCCTAC TCCTGCTCGC TGCAGGGGTA GGCATCTACC 9360

TCCTCCCCAA CCGATGAAGG TTGGGGTAAA CACTCCGGCC TCTTAGGCCA TTTCCTGTTT 9420

TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT CTTTTTTTTT 9480

TTTTTTTTCC TTTTTTTTTT TTTTTTTTTT CTTTCCTTCT TTTTTCCTTT CTTTTCCTTC 9540

CTTCTTTAAT GGTGGCTCCA TCTTAGCCCT AGTCACGGCT AGCTGTGAAA GGTCCGTGAG 9600

CCGCATGACT GCAGAGAGTG CTGATACTGG CCTCTCTGCA GATCATGTCG CATTCACGCG 9660

TTCGAATTAA TTAACTAGTG GGAATACGCG GGGTATGCCG CGTTTTAGCA TATTGACGAC 9720

CCAATTCTCA TGTTTGACAG CTTATCATCG ATAAGCTTTA ATGCGGTAGT TTATCACAGT 9780

TAAATTGCTA ACGCAGTCAG GCACCGTGTA TGAAATCTAA CAATGCGCTC ATCGTCATCC 9840

TCGGCACCGT CACCCTGGAT GCTGTAGGCA TAGGCTTGGT TATGCCGGTA CTGCCGGGCC 9900

TCTTGCGGGA TATCGTCCAT TCCGACAGCA TCGCCAGTCA CTATGGCGTG CTGCTAGCGC 9960

TATATGCGTT GATGCAATTT CTATGCGCAC CCGTTCTCGG AGCACTGTCC GACCGCTTTG 10020

GCCGCCGCCC AGTCCTGCTC GCTTCGCTAC TTGGAGCCAC TATCGACTAC GCGATCATGG 10080

CGACCACACC CGTCCTGTGG ATCCTCTACG CCGGACGCAT CGTGGCCGGC ATCACCGGCG 10140

CCACAGGTGC GGTTGCTGGC GCCTATATCG CCGACATCAC CGATGGGGAA GATCGGGCTC 10200

GCCACTTCGG GCTCATGAGC GCTTGTTTCG GCGTGGGTAT GGTGGCAGGC CCCGTGGCCG 10260

GGGGACTGTT GGGCGCCATC TCCTTGCATG CACCATTCCT TGCGGCGGCG GTGCTCAACG 10320

GCCTCAACCT ACTACTGGGC TGCTTCCTAA TGCAGGAGTC GCATAAGGGA GAGCGTCGAC 10380

CGATGCCCTT GAGAGCCTTC AACCCAGTCA GCTCCTTCCG GTGGGCGCGG GGCATGACTA 10440

TCGTCGCCGC ACTTATGACT GTCTTCTTTA TCATGCAACT CGTAGGACAG GTGCCGGCAG 10500

CGCTCTGGGT CATTTTCGGC GAGGACCGCT TTCGCTGGAG CGCGACGATG ATCGGCCTGT 10560

CGCTTGCGGT ATTCGGAATC TTGCACGCCC TCGCTCAAGC CTTCGTCACT GGTCCCGCCA 10620

CCAAACGTTT CGGCGAGAAG CAGGCCATTA TCGCCGGCAT GGCGGCCGAC GCGCTGGGCT 10680

ACGTCTTGCT GGCGTTCGCG ACGCGAGGCT GGATGGCCTT CCCCATTATG ATTCTTCTCG 10740

CTTCCGGCGG CATCGGGATG CCCGCGTTGC AGGCCATGCT GTCCAGGCAG GTAGATGACG 10800

ACCATCAGGG ACAGCTTCAA GGATCGCTCG CGGCTCTTAC CAGCCTAACT TCGATCACTG 10860

GACCGCTGAT CGTCACGGCG ATTTATGCCG CCTCGGCGAG CACATGGAAC GGGTTGGCAT 10920 

GGATTGTAGG CGCCGCCCTA TACCTTGTCT GCCTCCCCGC GTTGCGTCGC GGTGCATGGA 10980 

GCCGGGCCAC CTCGACCTGA ATGGAAGCCG GCGGCACCTC GCTAACGGAT TCACCACTCC 11040 

AAGAATTGGA GCCAATCAAT TCTTCCGGAG AACTGTG7VAT GCGCAAACCA ACCCTTGGCA 11100 

GAACATATCC ATCGCGTCCG CCATCTCCAG CAGCCGCACG CGGCGCATCT CGGG(ZAGCGT 11160 

TGGGTCCTGG CCACGGGTGC GCATGATCGT GCTCCTGTCG TTGAGGACCC GGCTA6GCTG 11220 

GCGGGGTTGC CTTACTGGTT AGCAGAATGA ATCACCGATA CGCGAGCGAA CGTGAAGCGA 11280 

CTGCTGCTGC AAAACGTCTG CGACCTGAGC AACAACATGA ATGGTCTTCG GTTTCCGTGT 11340 

TTCGTAAAGT CTGGAAACGC GGAAGTCAGC GCCCTGCACC ATTATGTTCC GGATCTGCAT 11400 

CGCAGGATGC TGCTGGCTAC CCTGTGGAAC ACCTACATCT GTATTAACGA AGCGCTGGCA 11460 

TTGACCCTGA GTGATTTTTC TCTGGTCCCG CCGCATCCAT ACCGCCAGTT GTTTACCCTC 11520 

ACAACGTTCC AGTAACCGGG CATGTTCATC ATCAGTAACC CGTATCGTGA GCATCCTCTC 11580 

TCGTTTCATC GGTATCATTA CCCCCATGAA CAGAAATTCC CCCTTACACG GAGGCATCAA 11640

GTGACCAAAC AGGAAAAAAC CGCCCTTAAC ATGGCCCGCT TTATCAGAAG CCAGACATTA 11700 

ACGCTTCTGG AGAAACTCAA CGAGCTGGAC GCGGATGAAC AGGCAGACAT CTGTGAATCG 11760 

CTTCACGACC ACGCTGATGA GCTTTACCGC AGCTGCCTCG CGCGTTTCGG TGATGACGGT 11820 

GAAAACCTCT GACACATGCA GCTCCCGGAG ACGGTCACAG CTTGTCTGTA AGCGGATGCC 11880 

GGGAGCAGAC AAGCCCGTCA GGGCGCGTCA GCGGGTGTTG GCGGGTGTCG GGGCGCAGCC 11940

ATGACCCAGT CACGTAGCGA TAGCGGAGTG TATACTGGCT TAACTATGCG GCATCAGAGC 12000

AGATTGTACT GAGAGTGCAC CATATGCGGT GTGAAATACC GCACAGATGC GTAAGGAGAA 12060 

AATACCGCAT CAGGCGCTCT TCCGCTTCCT CGCTCACTGA CTCGCTGCGC TCGGTCGTTC 12120 

GGCTGCGGCG AGCGGTATCA GCTCACTCAA AGGCGGTAAT ACGGTTATCC ACAGAATCAG 12180

GGGATAACGC AGGAAAGAAC ATGTGAGCAA AAGGCCAGCA AAAGGCCAGG AACCGTAAAA 12240

AGGCCGCGTT GCTGGCGTTT TTCCATAGGC TCCGCCCCCC TGACGAGCAT CACAAAAATC 12300 

GACGCTCAAG TCAGAGGTGG CGAAACCCGA CAGGACTATA AAGATACCAG GCGTTTCCCC 12360

CTGGAAGCTC CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTTACCGGA TACCTGTCCG 12420

CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCATAGCTC ACGCTGTAGG TATCTCAGTT 12480

CGGTGTAGGT CGTTCGCTCC AAGCTGGGCT GTGTGCACGA ACCCCCCGTT CAGCCCGACC 12540

GCTGCGCCTT ATCCGGTAAC TATCGTCTTG AGTCCAACCC GGTAAGACAC GACTTATCGC 12600

CACTGGCAGC AGCCACTGGT AACAGGATTA GCAGAGCGAG GTATGTAGGC GGTGCTACAG 12660

AGTTCTTGAA GTGGTGGCCT AACTACGGCT ACACTAGAAG GACAGTATTT GGTATCTGCG 12720

CTCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG CTCTTGATCC GGCAAACAAA 12780

CCACCGCTGG TAGCGGTGGT TTTTTTGTTT GCAAGCAGCA GATTACGCGC AGAAAAAAAG 12840

GATCTCAAGA AGATCCTTTG ATCTTTTCTA CGGGGTCTGA CGCTCAGTGG AACGAAAACT 12900

CACGTTAAGG GATTTTGGTC ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTCT 12960

AGATAATACG ACTCACTATA 12980

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GGCGACACTC CACCATAGAT C 21

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
TGGCACTACC CTCCAAGACC 20 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
ATGACACAAG GGGGCGCTCC GCACACT 27 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

TCCTGCTTGT GGATGATG 18 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 16 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
TAGTTTGGTG ATGTCA 16

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
ACATAGGTGC CAGTAAG 17

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
CTGGCAACGT GCATCA 16



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GGGTGAGAAC AATTACCA 18 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 16 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
ATTGATGCCC AATGCG 16 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
ACTGCCTGGG ATTCCCT 17

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CCACAGTGGC AGCGAGTG 18
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CATGGACGTC AACACG 16

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
AATCTTCACC GGTTGGGGAG GAGGTAGATG 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9416 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 



GCCAGCCCCC TGATGGGGGC GACACTCCAC CATAGATCAC TCCCCTGTGA GGAACTACTG 60

TCTTCACGCA GAAAGCGTCT AGCCATGGCG TTAGTATGAG TGTCGTGCAG CCTCCAGGAC 120

CCCCCCTCCC GGGAGAGCCA TAGTGGTCTG CGGAACCGGT GAGTACACCG GAATTGCCAG 180

GACGACCGGG TCCTTTCTTG GATAAACCCG CTCAATGCCT GGAGATTTGG GCGTGCCCCC 240

GCAAGACTGC TAGCCGAGTA GTGTTGGGTC GCGAAAGGCC TTGTGGTACT GCCTGATAGG 300

GTGCTTGCGA GTGCCCCGGG AGGTCTCGTA GACCGTGCAC CATGAGCACG AATCCTAAAC 360

CTCAAAGAAA AACCAAACGT AACACCAACC GTCGCCCACA GGACGTCGAG TTCCCGGGTG 420

GCGGTCAGAT CGTTGGTGGA GTTTACTTGT TGCCGCGCAG GGGCCCTAGA TTGGGTGTGC 480

GCGCGACGAG GAAGACTTCC GAGCGGTCGC AACCTCGTGG TAGACGTCAG CCTATCCCCA 540

AGGCACGTCG GCCCGAGGGC AGGACCTGGG CTCAGCCCGG GTACCCTTGG CCCCTCTATG 600

GCAATGAGGG TTGCGGGTGG GCGGGATGGC TCCTGTCTCC CCGTGGCTCT CGGCCTAGCT 660

GGGGCCCCAC AGACCCCCGG CGTAGGTCGC GCAATTTGGG TAAGGTCATC GATACCCTTA 720

CGTGCGGCTT CGCCGACCTC ATGGGGTACA TACCGCTCGT CGGCGCCCCT CTTGGAGGCG 780

CTGCCAGGGC CCTGGCGCAT GGCGTCCGGG TTCTGGAAGA CGGCGTGAAC TATGCAACAG 840 

GGAACCTTCC TGGTTGCTCT TTCTCTATCT TCCTTCTGGC CCTGCTCTCT TGCCTGACTG 900 

TGCCCGCTTC AGCCTACCAA GTGCGCAATT CCTCGGGGCT TTACCATGTC ACCAATGATT 960 

GCCCTAATTC GAGTATTGTG TACGAGGCGG CCGATGCCAT CCTGCACACT CCGGGGTGTG 1020 

TCCCTTGCGT TCGCGAGGGT AACGCCTCGA GGTGTTGGGT GGCGGTGACC CCCACGGTGG 1080

CCACCAGGGA CGGCAAACTC CCCACAACGC AGCTTCGACG TCATATCGAT CTGCTTGTCG 1140

GGAGCGCCAC CCTCTGCTCA GCCCTCTACG TGGGGGACCT GTGCGGGTCT GTTTTTCTTG 1200

TTGGTCAACT GTTTACCTTC TCTCCCAGGC GCCACTGGAC GACGCAAAGC TGCAATTGTT 1260 

CTATCTATCC CGGCCATATA ACGGGTCATC GCATGGCATG GGATATGATG ATGAACTGGT 1320 

CCCCTACGGC AGCGTTGGTG GTAGCTCAGC TGCTCCGGAT CCCACAAGCC ATCATGGACA 1380

TGATCGCTGG TGCTCACTGG GGAGTCCTGG CGGGCATAGC GTATTTCTCC ATGGTGGGGA 1440

ACTGGGCGAA GGTCCTGGTA GTGCTGCTGC TATTTGCCGG CGTCGACGCG GAAACCCACG 1500 

TCACCGGGGG AAGTGCCGGC CACACCACGG CTGGGCTTGT TGGTCTCCTT ACACCAGGCG 1560 

CCAAGCAGAA CATCCAACTG ATCAACACCA ACGGCAGTTG GCACATCAAT AGCACGGCCT 1620

TGAACTGCAA CGATAGCCTT ACCACCGGCT GGTTAGCAGG GCTCTTCTAT CGCCACAAAT 1680

TCAACTCTTC AGGCTGTCCT GAGAGGTTGG CCAGCTGCCG ACGCCTTACC GATTTTGCCC 1740

AGGGCTGGGG TCCCATCAGT TATGCCAACG GAAGCGGCCT TGACGAACGC CCCTACTGTT 1800 

GGCACTACCC TCCAAGACCT TGTGGCATTG TGCCCGCAAA GAGCGTGTGT GGCCCGGTAT 1860 

ATTGCTTCAC TCCCAGCCCC GTGGTGGTGG GAACGACCGA CAGGTCGGGC GCGCCTACCT 1920 

ACAGCTGGGG TGCAAATGAT ACGGATGTCT TCGTCCTTAA CAACACCAGG CCACCGCTGG 1980

GCAATTGGTT CGGTTGTACC TGGATGAACT CAACTGGATT CACCAAAGTG TGCGGAGCGC 2040

CCCCTTGTGT CATCGGAGGG GTGGGCAACA ACACCTTGCT CTGCCCCACT GATTGCTTCC 2100 

GCAAACATCC GGAAGCCACA TACTCTCGGT GCGGCTCCGG TCCCTGGATT ACACCCAGGT 2160

GCATGGTCGA CTACCCGTAT AGGCTTTGGC ACTATCCTTG TACTATCAAT TACACCATAT 2220

TCAAAGTCAG GATGTACGTG GGAGGGGTCG AGCACAGGCT GGAAGCGGCC TGCAACTGGA 2280

CGCGGGGCGA ACGCTGTGAT CTGGAAGACA GGGACAGGTC CGAGCTCAGC CCATTGCTGC 2340

TGTCCACCAC ACAGTGGCAG GTCCTTCCGT GTTCTTTCAC GACCCTGCCA GCCTTGTCCA 2400 

CCGGCCTCAT CCACCTCCAC CAGAACATTG TGGACGTGCA GTACTTGTAC GGGGTGGGGT 2460

CAAGCATCGC GTCCTGGGCC ATTAAGTGGG AGTACGTCGT TCTCCTGTTC CTTCTGCTTG 2520

CAGACGCGCG CGTCTGCTCC TGCTTGTGGA TGATGTTACT CATATCCCAA GCGGAGGCGG 2580

CTTTGGAGAA CCTCGTAATA CTCAATGCAG CATCCCTGGC CGGGACGCAC GGTCTTGTGT 2640 

CCTTCCTCGT GTTCTTCTGC TTTGCGTGGT ATCTGAAGGG TAGGTGGGTG CCCGGAGCGG 2700 

TCTACGCCTT CTACGGGATG TGGCCTCTCC TCCTGCTCCT GCTGGCGTTG CCTCAGCGGG 2760 

CATACGCACT GGACACGGAG GTGGCCGCGT CGTGTGGCGG CGTTGTTCTT GTCGGGTTAA 2820

TGGCGCTGAC TCTGTCACCA TATTACAAGC GCTATATCAG CTGGTGCATG TGGTGGCTTC 2880 

AGTATTTTCT GACCAGAGTA GAAGCGCAAC TGCACGTGTG GGTTCCCCCC CTCAACGTCC 2940

GGGGGGGGCG CGATGCCGTC ATCTTACTCA TGTGTGTTGT ACACCCGACT CTGGTATTTG 3000 

ACATCACCAA ACTACTCCTG GCCATCTTCG GACCCCTTTG GATTCTTCAA GCCAGTTTGC 3060 

TTAAAGTCCC CTACTTCGTG CGCGTTCAAG GCCTTCTCCG GATCTGCGCG CTAGCGCGGA 3120

AGATAGCCGG AGGTCATTAC GTGCAAATGG CCATCATCAA GTTGGGGGCG CTTACTGGCA 3180

CCTATGTGTA TAACCATCTC ACCCCTCTTC GAGACTGGGC GCACAACGGC CTGCGAGATC 3240 

TGGCCGTGGC TGTGGAACCA GTCGTCTTCT CCCGAATGGA GACCAAGCTC ATCACGTGGG 3300

GGGCAGATAC CGCCGCGTGC GGTGACATCA TCAACGGCTT GCCCGTCTCT GCCCGTAGGG 3360 

GCCAGGAGAT ACTGCTTGGA CCAGCCGACG GAATGGTCTC CAAGGGGTGG AGGTTGCTGG 3420 

CGCCCATCAC GGCGTACGCC CAGCAGACGA GAGGCCTCCT AGGGTGTATA ATCACCAGCC 3480

TGACTGGCCG GGACAAAAAC CAAGTGGAGG GTGAGGTCCA GATCGTGTCA ACTGCTACCC 3540 

AAACCTTCCT GGCAACGTGC ATCAATGGGG TATGCTGGAC TGTCTACCAC GGGGCCGGAA 3600 

CGAGGACCAT CGCATCACCC AAGGGTCCTG TCATCCAGAT GTATACCAAT GTGGACCAAG 3660 

ACCTTGTGGG CTGGCCCGCT CCTCAAGGTT CCCGCTCATT GACACCCTGC ACCTGCGGCT 3720

CCTCGGACCT TTACCTGGTT ACGAGGCACG CCGACGTCAT TCCCGTGCGC CGGCGAGGTG 3780

ATAGCAGGGG TAGCCTGCTT TCGCCCCGGC CCATTTCCTA CCTAAAAGGC TCCTCGGGGG 3840

GTCCGCTGTT GTGCCCCGCG GGACACGCCG TGGGCCTATT CAGGGCCGCG GTGTGCACCC 3900

GTGGAGTGAC CAAGGCGGTG GACTTTATCC CTGTGGAGAA CCTAGAGACA ACCATGAGAT 3960

CCCCGGTGTT CACGGACAAC TCCTCTCCAC CAGCAGTGCC CCAGAGCTTC CAGGTGGCCC 4020

ACCTGCATGC TCCCACCGGC AGTGGTAAGA GCACCAAGGT CCCGGCTGCG TACGCAGCCC 4080

AGGGCTACAA GGTGTTGGTG CTCAACCCCT CTGTTGCTGC AACGCTGGGC TTTGGTGCTT 4140

ACATGTCCAA GGCCCATGGG GTCGATCCTA ATATCAGGAC CGGGGTGAGA ACAATTACCA 4200

CTGGCAGCCC CATCACGTAC TCCACCTACG GCAAGTTCCT TGCCGACGGG GGGTGCTCAG 4260

GAGGCGCTTA TGACATAATA ATTTGTGACG AGTGCCACTC CACGGATGCC ACATCCATCT 4320

TGGGCATCGG CACTGTCCTT GACCAAGCAG AGACTGCGGG GGCGAGATTG GTTGTGCTCG 4380

CCACTGCTAC CCCTCCGGGC TCCGTCACTG TGTCCCATCC TAACATCGAG GAGGTTGCTC 4440

TGTCCACCAC CGGAGAGATC CCTTTCTACG GCAAGGCTAT CCCCCTCGAG GTGATCAAGG 4500

GGGGAAGACA TCTCATCTTC TGTCACTCAA AGAAGAAGTG CGACGAGCTC GCCGCGAAGC 4560

TGGTCGCATT GGGCATCAAT GCCGTGGCCT ACTACCGCGG ACTTGACGTG TCTGTCATCC 4620

CGACCAACGG CGATGTTGTC GTCGTGTCGA CCGATGCTCT CATGACTGGC TTTACCGGCG 4680

ACTTCGACTC TGTGATAGAC TGCAACACGT GTGTCACTCA GACAGTCGAT TTCAGCCTTG 4740

ACCCTACCTT TACCATTGAG ACAACCACGC TCCCCCAGGA TGCTGTCTCC AGGACTCAGC 4800

GCCGGGGCAG GACTGGCAGG GGGAAGCCAG GCATCTACAG ATTTGTGGCA CCGGGGGAGC 4860

GCCCCTCCGG CATGTTCGAC TCGTCCGTCC TCTGTGAGTG CTATGACGCG GGCTGTGCTT 4920

GGTATGAGCT CATGCCCGCC GAGACTACAG TTAGGCTACG AGCGTACATG AACACCCCGG 4980

GGCTTCCCGT GTGCCAGGAC CATCTTGAAT TTTGGGAGGG CGTCTTTACG GGCCTCACCC 5040

ATATAGATGC CCACTTTCTA TCCCAGACAA AGCAGAGTGG GGAGAACTTT CCTTACCTGG 5100

TAGCGTACCA AGCCACCGTG TGCGCTAGGG CTCAAGCCCC TCCCCCATCG TGGGACCAGA 5160

TGTGGAAGTG TTTGATCCGC CTTAAACCCA CCCTCCATGG GCCAACACCC CTGCTATACA 5220

GACTGGGCGC TGTTCAGAAT GAAGTCACCC TGACGCACCC AATCACCAAA TACATCATGA 5280

CATGCATGTC GGCCGACCTG GAGGTCGTCA CGAGCACCTG GGTGCTCGTT GGCGGCGTCC 5340

TGGCTGCTCT GGCCGCGTAT TGCCTGTCAA CAGGCTGCGT GGTCATAGTG GGCAGGATTG 5400

TCTTGTCCGG GAAGCCGGCA ATTATACCTG ACAGGGAGGT TCTCTACCAG GAGTTCGATG 5460

AGATGGAAGA GTGCTCTCAG CACTTACCGT ACATCGAGCA AGGGATGATG CTCGCTGAGC 5520

AGTTCAAGCA GAAGGCCCTC GGCCTCCTGC AGACCGCGTC CCGCCATCCA GAGGTTATCA 5580 

CCCCTGCTGT CCAGACCAAC TGGCAGAAAC TCGAGGTCTT CTGGGCGAAG CACATGTGGA 5640

ATTTCATCAG TGGGATACAA TATTTGGCGG GCCTGTCAAC GCTGCCTGGT AACCCCGCCA 5700

TTGCTTCATT GATGGCTTTT ACAGCTGCCG TCACCAGCCC ACTAACCACT GGCCAAACCC 5760

TCCTCTTCAA CATATTGGGG GGGTGGGTGG CTGCCCAGCT CGCCGCCCCC GGTGCCGCTA 5820 

CCGCCTTTGT GGGCGCTGGC TTAGCTGGCG CCGCCATCGG CAGCGTTGGA CTGGGGAAGG 5880

TCCTCGTGGA CATTCTTGCA GGGTATGGCG CGGGCGTCGC GGGAGCTCTT GTAGCATTCA 5940

AGATCATGAG CGGTGAGGTC CCCTCCACGG AGGACCTGGT CAATCTGCTG CCCGCCATCC 6000

TCTCGCCTGG AGCCCTTGTA GTCGGTGTGG TCTGCGCAGC AATACTGCGC CGGCACGTTG 6060

GCCCGGGCGA GGGGGCAGTG CAATGGATGA ACCGGCTAAT AGCCTTCGCC TCCCGGGGGA 6120

ACCATGTTTC CCCCACGCAC TACGTGCCGG AGAGCGATGC AGCCGCCCGC GTCACTGCCA 6180

TACTCAGCAG CCTCACTCTA ACCCAGCTCC TGAGGCGACT ACATCAGTGG ATAAGCTCGG 6240

AGTGTACCAC TCCATGCTCC GGCTCCTGGC TAAGGGACAT CTGGGACTCG ATATGCGAGG 6300

TGCTGAGCGA CTTTAAGACC TGGCTGAAAG CCAAGCTCAT GCCACAACTG CCTGGGATTC 6360 

CCTTTGTGTC CTGCCAGCGC GGGTATAGGG GGGTCTGGCG AGGAGACGGC ATTATGCACA 6420

CTCGCTGCCA CTGTGGAGCT GAGATCACTG GACATGTCAA AAACGGGACG ATGAGGATCG 6480

TCGGTCCTAG GACCTGCAGG AACATCTGGA GTGGGACCTT CCCCATTAAC GCCTACACCA 6540

CGGGCCCCTC TACTCCCCTT CCTGCGCCGA ACTATAAGTT CGCGCTGTCG AGGGTGTCTG 6600

CAGAGGAATA CGTGGAGATA AGGCGGGTGG GGGACTTCCA CTACGTATCG GGTATCACTA 6660

CTGACAATCT TAAATGCCCG TGCCAGATCC CATCGCCCGA ATTTTTCACA GAATTGGACG 6720

GGGTGCGCCT ACATAGGTTT GCGCCCCCTT GCAAGCCCTT GCTGCGGGAG GAGGTATCAT 6780 

TCAGAGTAGG ACTCCACGAG TACCCGGTGG GGTCGCAATT ACCTTCCGAG CCCGAACCGG 6840

ACGTAGCCGT GTTGACGTCC ATGCTCACTG ATCCCTCCCA TATAACAGCA GAGGCGGCCG 6900

GGAGAAGGTT GGCGAGAGGG TCACCCCCTT CTATGGCCAG CTCCTCGGCC AGCCAGCTGT 6960

CCGCTCCATC TCTCAAGGCA ACTTGCACCG CCAACCATGA CTCCCCTGAC GCCGAGCTCA 7020 

TAGAGGCTAA CCTCCTGTGG AGGCAGGAGA TGGGCGGCAA CATCACCAGG GTTGAGTCAG 7080 

AGAACAAAGT GGTGATTCTG GACTCCTTCG ATCCGCTTGT GGCAGAGGAG GATGAGCGGG 7140

AGGTCTCCGT ACCCGCAGAA ATTCTGCGGA AGTCTCGGAG ATTCGCCCGG GCCCTGCCCG 7200 

TTTGGGCGCG GCCGGACTAC AACCCCCCGC TAGTAGAGAC GTGGAAAAAG CCTGACTACG 7260 

AACCACCTGT GGTCCATGGC TGCCCGCTAC CACCTCCACG GTCCCCTCCT GTGCCTCCGC 7320

CTCGGAAAAA GCGTACGGTG GTCCTCACCG AATCAACCCT ACCTACTGCC TTGGCCGAGC 7380 

TTGCCACCAA AAGTTTTGGC AGCTCCTCAA CTTCCGGCAT TACGGGCGAC AATATGACAA 7440 

CATCCTCTCA GCCCGCCCCT TCTGGCTGCC CCCCCGACTC CGACGTTGAG TCCTATTCTT 7500 

CCATGCCCCC CCTGGAGGGG GAGCCTGGGG ATCCGGATTT CAGCGACGGG TCATGGTCGA 7560 

CGGTCAGTAG TGGGGCCGAC ACGGAAGATG TCGTGTGCTG CTCAATGTCT TATACCTGGA 7620 

CAGGCGCACT CGTCACCCCG TGCGCTGCGG AAGAACAAAA ACTGCCCATC AACGCACTGA 7680 

GCAACTCGTT GCTACGCCAT CACAATCTGG TATATTCCAC CACTTCACGC AGTGCTTGCC 7740 

AAAGGCAGAA GAAAGTCACA TTTGACAGAC TGCAAGTTCT GGACAGCCAT TACCAGGACG 7800 

TGCTCAAGGA GGTCAAAGCA GCGGCGTCAA AAGTGAAGGC TAACTTGCTA TCCGTAGAGG 7860

AAGCTTGCAG CCTGACGCCC CCACATTCAG CCAAATCCAA GTTTGGCTAT GGGGCAAAAG 7920 

ACGTCCGTTG CCATGCCAGA AAGGCCGTAG CCCACATCAA CTCCGTGTGG AAAGACCTTC 7980 

TGGAAGACAG TGTAACACCA ATAGACACTA TCATCATGGC CAAGAACGAG GTCTTCTGCG 8040 

TTCAGCCTGA GAAGGGGGGT CGTAAGCCAG CTCGTCTCAT CGTGTTCCCC GACCTGGGCG 8100 

TGCGCGTGTG CGAGAAGATG GCCCTGTACG ACGTGGTTAG CAAACTCCCC CTGGCCGTGA 8160

TGGGAAGCTC CTACGGATTC CAATACTCAC CAGGACAGCG GGTTGAATTC CTCGTGCAAG 8220

CGTGGAAGTC CAAGAAGACC CCGATGGGGT TCCCGTATGA TACCCGCTGT TTTGACTCCA 8280 

CAGTCACTGA GAGCGACATC CGTACGGAGG AGGCAATTTA CCAATGTTGT GACCTGGACC 8340

CCCAAGCCCG CGTGGCCATC AAGTCCCTCA CTGAGAGGCT TTATGTTGGG GGCCCTCTTA 8400

CCAATTCAAG GGGGGAAAAC TGCGGCTATC GCAGGTGCCG CGCGAGCGGC GTACTGACAA 8460

CTAGCTGTGG TAACACCCTC ACTTGCTACA TCAAGGCCCG GGCAGCCCGT CGAGCCGCAG 8520

GGCTCCAGGA CTGCACCATG CTCGTGTGTG GCGACGACTT AGTCGTTATC TGTGAAAGTG 8580

CGGGGGTCCA GGAGGACGCG GCGAGCCTGA GAGCCTTTAC GGAGGCTATG ACCAGGTACT 8640

CCGCCCCCCC CGGGGACCCC CCACAACCAG AATACGACTT GGAGCTTATA ACATCATGCT 8700

CCTCCAACGT GTCAGTCGCC CACGACGGCG CTGGAAAAAG GGTCTACTAC CTTACCCGTG 8760

ACCCTACAAC CCCCCTCGCG AGAGCCGCGT GGGAGACAGC AAGACACACT CCAGTCAATT 8820

CCTGGCTAGG CAACATAATC ATGTTTGCCC CCACACTGTG GGCGAGGATG ATACTGATGA 8880

CCCATTTCTT TAGCGTCCTC ATAGCCAGGG ATCAGCTTGA ACAGGCTCTT AACTGTGAGA 8940

TCTACGCAGC CTGCTACTCC ATAGAACCAC TGGATCTACC TCCAATCATT CAAAGACTCC 9000

ATGGCCTCAG CGCATTTTTA CTCCACAGTT ACTCTCCAGG TGAAGTCAAT AGGGTGGCCG 9060

CATGCCTCAG AAAACTTGGG GTCCCGCCCT TGCGAGCTTG GAGACACCGG GCCCGGAGCG 9120

TCCGCGCTAG GCTTCTGTCC AGGGGAGGCA GGGCTGCCAT ATGTGGCAAG TACCTCTTCA 9180

ACTGGGCAGT AAGAACAAAG CTCAAACTCA CTCCAATAGC GGCCGCTGGC CGGCTGGACT 9240

TGTCCGGTTG GTTCACGGCT GGCTACAGCG GGGGAGACAT TTATCACAGC GTGTCTCATG 9300

CCCGGCCCCG CTGGTTCTGG TTTTGCCTAC TCCTGCTCGC TGCAGGGGTA GGCATCTACC 9360

TCCTCCCCAA CCGGTGAAGA TTGGGCTAAC CACTCCAGGC CAATAGGCCA TCCCCT 9416


(2) INFORMATION FOR SEQ ID NO: 20: 





(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3011 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: N-terminal



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn Thr Asn
1 5 10 15

Arg Arg Pro Gln Asp Val Glu Phe Pro Gly Gly Gly Gln Ile Val Gly
20 25 30

Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro Arg Leu Gly Val Arg Ala
35 40 45

Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro Arg Gly Arg Arg Gln Pro
50 55 60

Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly
65 70 75 80

Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp
85 90 95

Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu Thr Cys
115 120 125

Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val Gly Ala Pro Leu
130 135 140

Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val Arg Val Leu Glu Asp
145 150 155 160

Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly Cys Ser Phe Ser Ile
165 170 175

Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val Pro Ala Ser Ala Tyr
180 185 190

Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp Cys Pro
195 200 205

Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu His Thr Pro
210 215 220

Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser Arg Cys Trp Val
225 230 235 240

Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly Lys Leu Pro Thr Thr
245 250 255

Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly Ser Ala Thr Leu Cys
260 265 270

Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Gly
275 280 285

Gln Leu Phe Thr Phe Ser Pro Arg Arg His Trp Thr Thr Gln Ser Cys
290 295 300

Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala Trp
305 310 315 320

Asp Met Met Met Asn Trp Ser Pro Thr Ala Ala Leu Val Val Ala Gln
325 330 335

Leu Leu Arg Ile Pro Gln Ala Ile Met Asp Met Ile Ala Gly Ala His
340 345 350

Trp Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp
355 360 365

Ala Lys Val Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu
370 375 380

Thr His Val Thr Gly Gly Ser Ala Gly His Thr Thr Ala Gly Leu Val
385 390 395 400

Gly Leu Leu Thr Pro Gly Ala Lys Gln Asn Ile Gln Leu Ile Asn Thr
405 410 415

Asn Gly Ser Trp His Ile Asn Ser Thr Ala Leu Asn Cys Asn Asp Ser
420 425 430

Leu Thr Thr Gly Trp Leu Ala Gly Leu Phe Tyr Arg His Lys Phe Asn
435 440 445

Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Arg Leu Thr Asp
450 455 460

Phe Ala Gln Gly Trp Gly Pro Ile Ser Tyr Ala Asn Gly Ser Gly Leu
465 470 475 480

Asp Glu Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro Cys Gly Ile
485 490 495

Val Pro Ala Lys Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser
500 505 510

Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser
515 520 525

Trp Gly Ala Asn Asp Thr Asp Val Phe Val Leu Asn Asn Thr Arg Pro
530 535 540

Pro Leu Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe
545 550 555 560

Thr Lys Val Cys Gly Ala Pro Pro Cys Val Ile Gly Gly Val Gly Asn
565 570 575

Asn Thr Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu Ala
580 585 590

Thr Tyr Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro Arg Cys Met
595 600 605

Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Ile Asn Tyr
610 615 620

Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg Leu
625 630 635 640

Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu Asp
645 650 655

Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Gln Trp
660 665 670

Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr Gly
675 680 685

Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr Gly
690 695 700

Val Gly Ser Ser Ile Ala Ser Trp Ala Ile Lys Trp Glu Tyr Val Val
705 710 715 720

Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys Ser Cys Leu Trp
725 730 735

Met Met Leu Leu Ile Ser Gln Ala Glu Ala Ala Leu Glu Asn Leu Val
740 745 750

Ile Leu Asn Ala Ala Ser Leu Ala Gly Thr His Gly Leu Val Ser Phe
755 760 765

Leu Val Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly Arg Trp Val Pro
770 775 780

Gly Ala Val Tyr Ala Phe Tyr Gly Met Trp Pro Leu Leu Leu Leu Leu
785 790 795 800

Leu Ala Leu Pro Gln Arg Ala Tyr Ala Leu Asp Thr Glu Val Ala Ala
805 810 815

Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala Leu Thr Leu Ser
820 825 830

Pro Tyr Tyr Lys Arg Tyr Ile Ser Trp Cys Met Trp Trp Leu Gln Tyr
835 840 845

Phe Leu Thr Arg Val Glu Ala Gln Leu His Val Trp Val Pro Pro Leu
850 855 860

Asn Val Arg Gly Gly Arg Asp Ala Val Ile Leu Leu Met Cys Val Val
865 870 875 880

His Pro Thr Leu Val Phe Asp Ile Thr Lys Leu Leu Leu Ala Ile Phe
885 890 895

Gly Pro Leu Trp Ile Leu Gln Ala Ser Leu Leu Lys Val Pro Tyr Phe
900 905 910

Val Arg Val Gln Gly Leu Leu Arg Ile Cys Ala Leu Ala Arg Lys Ile
915 920 925

Ala Gly Gly His Tyr Val Gln Met Ala Ile Ile Lys Leu Gly Ala Leu
930 935 940

Thr Gly Thr Tyr Val Tyr Asn His Leu Thr Pro Leu Arg Asp Trp Ala
945 950 955 960

His Asn Gly Leu Arg Asp Leu Ala Val Ala Val Glu Pro Val Val Phe
965 970 975

Ser Arg Met Glu Thr Lys Leu Ile Thr Trp Gly Ala Asp Thr Ala Ala
980 985 990

Cys Gly Asp Ile Ile Asn Gly Leu Pro Val Ser Ala Arg Arg Gly Gln
995 1000 1005

Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val Ser Lys Gly Trp Arg
1010 1015 1020

Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu
1025 1030 1035 1040

Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu
1045 1050 1055

Gly Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr
1060 1065 1070

Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg
1075 1080 1085

Thr Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val
1090 1095 1100

Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg Ser Leu
1105 1110 1115 1120

Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His
1125 1130 1135

Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu
1140 1145 1150

Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro
1155 1160 1165

Leu Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe Arg Ala Ala Val
1170 1175 1180

Cys Thr Arg Gly Val Thr Lys Ala Val Asp Phe Ile Pro Val Glu Asn
1185 1190 1195 1200

Leu Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro
1205 1210 1215

Pro Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr
1220 1225 1230

Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly
1235 1240 1245

Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe
1250 1255 1260

Gly Ala Tyr Met Ser Lys Ala His Gly Val Asp Pro Asn Ile Arg Thr
1265 1270 1275 1280

Gly Val Arg Thr Ile Thr Thr Gly Ser Pro Ile Thr Tyr Ser Thr Tyr
1285 1290 1295

Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile
1300 1305 1310

Ile Ile Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser Ile Leu Gly
1315 1320 1325

Ile Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val
1330 1335 1340

Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Ser His Pro
1345 1350 1355 1360

Asn Ile Glu Glu Val Ala Leu Ser Thr Thr Gly Glu Ile Pro Phe Tyr
1365 1370 1375

Gly Lys Ala Ile Pro Leu Glu Val Ile Lys Gly Gly Arg His Leu Ile
1380 1385 1390

Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val
1395 1400 1405

Ala Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser
1410 1415 1420

Val Ile Pro Thr Asn Gly Asp Val Val Val Val Ser Thr Asp Ala Leu
1425 1430 1435 1440

Met Thr Gly Phe Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr
1445 1450 1455

Cys Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile
1460 1465 1470

Glu Thr Thr Thr Leu Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg
1475 1480 1485

Gly Arg Thr Gly Arg Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro
1490 1495 1500

Gly Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys
1505 1510 1515 1520

Tyr Asp Ala Gly Cys Ala Trp Tyr Glu Leu Met Pro Ala Glu Thr Thr
1525 1530 1535

Val Arg Leu Arg Ala Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln
1540 1545 1550

Asp His Leu Glu Phe Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile
1555 1560 1565

Asp Ala His Phe Leu Ser Gln Thr Lys Gln Ser Gly Glu Asn Phe Pro
1570 1575 1580

Tyr Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro
1585 1590 1595 1600

Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro
1605 1610 1615

Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln
1620 1625 1630

Asn Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Thr Cys
1635 1640 1645

Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly
1650 1655 1660

Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val
1665 1670 1675 1680

Val Ile Val Gly Arg Ile Val Leu Ser Gly Lys Pro Ala Ile Ile Pro
1685 1690 1695

Asp Arg Glu Val Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ser
1700 1705 1710

Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe
1715 1720 1725

Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg His Ala Glu
1730 1735 1740

Val Ile Thr Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Val Phe
1745 1750 1755 1760

Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala
1765 1770 1775

Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala
1780 1785 1790

Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Gly Gln Thr Leu Leu
1795 1800 1805

Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly
1810 1815 1820

Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly
1825 1830 1835 1840

Ser Val Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly
1845 1850 1855

Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu
1860 1865 1870

Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser
1875 1880 1885

Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg
1890 1895 1900

His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile
1905 1910 1915 1920

Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro
1925 1930 1935

Glu Ser Asp Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr
1940 1945 1950

Val Thr Gln Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys
1955 1960 1965

Thr Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile
1970 1975 1980

Cys Glu Val Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met
1985 1990 1995 2000

Pro Gln Leu Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Arg
2005 2010 2015

Gly Val Trp Arg Gly Asp Gly Ile Met His Thr Arg Cys His Cys Gly
2020 2025 2030

Ala Glu Ile Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly
2035 2040 2045

Pro Arg Thr Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala
2050 2055 2060

Tyr Thr Thr Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Lys Phe
2065 2070 2075 2080

Ala Leu Trp Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Arg Val
2085 2090 2095

Gly Asp Phe His Tyr Val Ser Gly Met Thr Thr Asp Asn Leu Lys Cys
2100 2105 2110

Pro Cys Gln Ile Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val
2115 2120 2125

Arg Leu His Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu
2130 2135 2140

Val Ser Phe Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu
2145 2150 2155 2160

Pro Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr
2165 2170 2175

Asp Pro Ser His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Ala Arg
2180 2185 2190

Gly Ser Pro Pro Ser Met Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala
2195 2200 2205

Pro Ser Leu Lys Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala
2210 2215 2220

Glu Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn
2225 2230 2235 2240

Ile Thr Arg Val Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe
2245 2250 2255

Asp Pro Leu Val Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala
2260 2265 2270

Glu Ile Leu Arg Lys Ser Arg Arg Phe Ala Arg Ala Leu Pro Val Trp
2275 2280 2285

Ala Arg Pro Asp Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro
2290 2295 2300

Asp Tyr Glu Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Arg
2305 2310 2315 2320

Ser Pro Pro Val Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr
2325 2330 2335

Glu Ser Thr Leu Pro Thr Ala Leu Ala Glu Leu Ala Thr Lys Ser Phe
2340 2345 2350

Gly Ser Ser Ser Thr Ser Gly Ile Thr Gly Asp Asn Met Thr Thr Ser
2355 2360 2365

Ser Glu Pro Ala Pro Ser Gly Cys Pro Pro Asp Ser Asp Val Glu Ser
2370 2375 2380

Tyr Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Phe
2385 2390 2395 2400

Ser Asp Gly Ser Trp Ser Thr Val Ser Ser Gly Ala Asp Thr Glu Asp
2405 2410 2415

Val Val Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Val Thr
2420 2425 2430

Pro Cys Ala Ala Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn
2435 2440 2445 
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Ser Leu Leu Arg His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser 
2450 2455 2460 

Ala Cys Gln Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu 
2465 2470 2475 2480 

Asp Ser His Tyr Gln Asp Val Leu Lys Glu Val Lys Ala Ala Ala Ser 
2485 2490 2495 

Lys Val Lys Ala Asn Leu Leu Ser Val Glu Glu Ala Cys Ser Leu Thr 
2500 2505 2510 

Pro Pro His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val 
2515 2520 2525 

Arg Cys His Ala Arg Lys Ala Val Ala His Ile Asn Ser Val Trp Lys 
2530 2535 2540 

Asp Leu Leu Glu Asp Ser Val Thr Pro Ile Asp Thr Ile Ile Met Ala 
2545 2550 2555 2560 

Lys Asn Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro 
2565 2570 2575 

Ala Arg Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys 
2580 2585 2590 

Met Ala Leu Tyr Asp Val Val Ser Lys Leu Pro Leu Ala Val Met Gly 
2595 2600 2605 

Ser Ser Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu 
2610 2615 2620 

Val Gln Ala Trp Lys Ser Lys Lys Thr Pro Met Gly Phe Pro Tyr Asp 
2625 2630 2635 2640 

Thr Arg Cys Phe Asp Ser Thr Val Thr Glu Ser Asp Ile Arg Thr Glu 
2645 2650 2655 

Glu Ala Ile Tyr Gln Cys Cys Asp Leu Asp Pro Gln Ala Arg Val Ala 
2660 2665 2670 

Ile Lys Ser Leu Thr Glu Arg Leu Tyr Val Gly Gly Pro Leu Thr Asn 
2675 2680 2685 

Ser Arg Gly Glu Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val 
2690 2695 2700 

Leu Thr Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Ile Lys Ala Arg 
2705 2710 2715 2720 

Ala Ala Arg Arg Ala Ala Gly Leu Gln Asp Cys Thr Met Leu Val Cys 
2725 2730 2735 

Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp 
2740 2745 2750 

Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala 
2755 2760 2765 

Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr 
2770 2775 2780 

Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg 
2785 2790 2795 2800 

Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala 
2805 2810 2815 

Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile 
2820 2825 2830 

Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His 
2835 2840 2845 

Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asn 
2850 2855 2860 

Cys Glu Ile Tyr Ala Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro 
2865 2870 2875 2880 

Pro Ile Ile Gln Arg Leu His Gly Leu Ser Ala Phe Leu Leu His Ser 
2885 2890 2895 

Tyr Ser Pro Gly Glu Val Asn Arg Val Ala Ala Cys Leu Arg Lys Leu 
2900 2905 2910 

Gly Val Pro Pro Leu Arg Ala Trp Arg His Arg Ala Arg Ser Val Arg 
2915 2920 2925 

Ala Arg Leu Leu Ser Arg Gly Gly Arg Ala Ala Ile Cys Gly Lys Tyr 
2930 2935 2940 

Leu Phe Asn Trp Ala Val Arg Thr Lys Leu Lys Leu Thr Pro Ile Ala 
2945 2950 2955 2960 

Ala Ala Gly Arg Leu Asp Leu Ser Gly Trp Phe Thr Ala Gly Tyr Ser 
2965 2970 2975 

Gly Gly Asp Ile Tyr His Ser Val Ser His Ala Arg Pro Arg Trp Phe 
2980 2985 2990 

Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly Val Gly Ile Tyr Leu Leu 
2995 3000 3005 



Pro Asn Arg 
3010 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
TGTCGCATTC 10 

WHAT IS CLAIMED IS: 



1. A genetically engineered hepatitis C virus (HCV) nucleic acid clone which 
comprises from 5' to 3' on the positive-sense nucleic acid a functional 5' non-translated 
region (NTR) comprising an extreme 5'-terminal conserved sequence, an open reading 
frame (ORF) encoding at least a portion of an HCV polyprotein whose cleavage 
products form functional components of HCV virus particles and RNA replication 
machinery, and a 3' non-translated region (NTR) comprising an extreme 3'-terminal 
conserved sequence, or a derivative thereof selected from the group consisting of 
adapted virus, live-attenuated virus, replication-competent non-infectious virus, and 
defective virus. 



2. The HCV nucleic acid of claim 1 which has a consensus nucleic acid sequence 
determined from the sequence of a majority of at least three clones of an HCV isolate 
or genotype. 

3. The HCV nucleic acid of claim 2 having at least a functional portion of a 
sequence as shown in SEQ ID NO:1. 

4. The HCV nucleic acid of claim 1 or 3, wherein a region from an HCV isolate is 
substituted for a homologous region. 

5. The HCV nucleic acid of claim 1 which is a DNA that codes on expression for 
a replication-competent HCV RNA replicon, or which is a replication-competent HCV 
RNA replicon. 

6. An HCV nucleic acid of claim 1, 3, or 5 which has the full length sequence as 
depicted in or corresponding to SEQ ID NO:1. 

7. The HCV nucleic acid of claim 1 wherein the 5'-terminal sequence is 
homologous or complementary to an RNA sequence selected from the group consisting 
of GCCAGCC; GGCCAGCC; UGCCAGCC; AGCCAGCC; AAGCCAGCC; 
GAGCCAGCC; GUGCCAGCC; and GCGCCAGCC, wherein the sequence 
GCCAGCC is the 5'-terminus of SEQ ID NO:3. 

8. The HCV nucleic acid of claim 1 wherein the 3'-NTR extreme terminus is 
homologous or complementary to a DNA having the sequence 

5'-GGTGGCTCCATCTTAGCCCTAGTCACGGCTAGCTGTGAAAGGTCCGTGAG 

CCGCATGACTGCAGAGAGTGCTGATACTGGCCTCTCTGCTGATCATGT-3' 

(SEQ ID NO:4). 

9. The HCV nucleic acid of claim 1 wherein the 3'-NTR comprises a long poly- 
pyrimidine region. 

10. The HCV nucleic acid of claim 1, 3, or 5 further comprising a heterologous 
gene operatively associated with an expression control sequence, wherein the 
heterologous gene and expression control sequence are oriented on the positive-strand 
nucleic acid molecule. 

11. The HCV nucleic acid of claim 10 wherein the heterologous gene is inserted by 
a strategy selected from the group consisting of: 

a) in-frame fusion with the HCV polyprotein coding sequence; and 

b) creation of an additional cistron. 

12. The HCV nucleic acid of claim 10, wherein the heterologous gene is an 
antibiotic resistance gene or a reporter gene. 

13. The HCV nucleic acid of claim 11, wherein the antibiotic resistance gene is 
neomycin resistance gene operatively associated with an internal ribosome entry site 
(IRES) inserted in an Sfil site in the 3'-NTR. 

14. The HCV nucleic acid of claim 1, 3, or 5 which is selected from the group 
consisting of double stranded DNA, positive-sense cDNA, or negative-sense cDNA. 

15. The HCV nucleic acid of claim 1, 3, or 5 which is positive-sense RNA or 
negative-sense RNA. 

16. The HCV DNA of claim 14 further comprising a promoter 5' of the 5'-NTR on 
positive-sense DNA, whereby transcription of template DNA from the promoter 
produces replication-competent RNA. 

17. A plasmid clone harboring a full-length HCV cDNA which can be transcribed 
to produce infectious HCV RNA transcripts as deposited with the American Type 
Culture Collection and assigned accession no. 97879, having a sequence as depicted in 
SEQ ID NO:5, or a derivative thereof selected from the group consisting of 

a) a derivative wherein a 5'-terminal sequence is homologous or 
complementary to an RNA sequence selected from the group consisting of 
GCCAGCC, GGCCAGCC, UGCCAGCC, AGCCAGCC, AAGCCAGCC, 
GAGCCAGCC, GUGCCAGCC, and GCGCCAGCC, wherein the sequence 
GCCAGCC is the 5'-terminus of SEQ ID NO:3; and 

b) a derivative wherein a 3'-NTR comprises a short poly-pyrimidine 
region. 

18. A plasmid clone harboring a full-length HCV cDNA which can be transcribed 
to produce infectious HCV RNA transcripts as deposited with the American Type 
Culture Collection and assigned accession no. 97879, having a sequence as depicted in 
SEQ ID NO: 5, or a derivative thereof selected from the group consisting of 

a) a derivative produced by substitution of homologous regions from other 
HCV isolates or genotypes; 

b) a derivative produced by mutagenesis; 

c) a derivative selected from the group consisting of adapted, 
live-attenuated, replication competent non-infectious, and defective variants; 

d) a derivative comprising a heterologous gene operatively associated with 
an expression control sequence; 

e) a derivative consisting of a functional fragment of any of the 
abovementioned derivatives. 

19. An HCV DNA or RNA transcribed from the full length HCV cDNA harbored 
in the plasmid clone of claim 17 or 18. 

20. A method for identifying a cell line that is permissive for infection with HCV, 
comprising contacting a cell line in tissue culture with an infectious amount of the HCV 
RNA of claim 15, and detecting replication of HCV in cells of the cell line. 

21. A method for identifying a cell line that is permissive for infection with HCV, 
comprising contacting a cell line in tissue culture with an infectious amount of an 
infectious HCV RNA of claim 19 under conditions that select for cells that express the 
heterologous expression control sequence. 

22. A method for identifying an animal that is permissive for infection with HCV, 
comprising introducing an infectious amount of the HCV RNA of claim 15 to the 
animal, and detecting replication of HCV in the animal. 

23. A method for selecting for HCV with adaptive mutations that permit higher 
levels of HCV replication in a permissive cell line comprising contacting a cell line in 
culture with an infectious amount of the HCV RNA of claim 15, and detecting 
progressively increasing levels of HCV RNA in the cell line. 

24. The method according to claim 23, wherein the adaptive mutation permits 
modification of HCV tropism. 

25. A host cell line transfected, transformed, or transduced with the HCV DNA of 
claim 16. 

26. The host cell line of claim 25 selected from the group consisting of a bacterial 
cell, a yeast cell, a plant cell, an insect cell, and a mammalian cell. 

27. A method for infecting an animal with HCV which comprises administering an 
infectious dose of HCV RNA of claim 15 to the animal. 

28. A method for infecting an animal with HCV which comprises administering an 
infectious dose of HCV RNA of claim 19 to the animal. 

29. A non-human animal infected with HCV, wherein the HCV has a genomic RNA 
sequence corresponding to the HCV nucleic acid of claim 1, 3, or 5. 

30. A method for propagating HCV in vitro comprising culturing a cell line 
contacted with an infectious amount of HCV RNA of claim 15 under conditions that 
permit replication of the HCV RNA. 

31. A method for propagating HCV in vitro comprising culturing a cell line 
contacted with an infectious amount of HCV RNA of claim 19 under conditions that 
permit replication of the HCV RNA. 

32. An in vitro cell line infected with HCV, wherein the HCV has a genomic RNA 
sequence corresponding to the HCV nucleic acid of claim 1, 3, or 5. 

33. The cell line of claim 32 which is a hepatocyte cell line. 

34. A method for transducing an animal susceptible to HCV infection with a 
heterologous gene, comprising administering an amount of the HCV nucleic acid of 
claim 10 to the animal effective to infect the animal with the HCV. 

35. A method for transducing an animal susceptible to HCV infection with a 
heterologous gene, comprising administering an amount of the HCV RNA of claim 19 
to the animal effective to infect the animal with the HCV RNA. 

36. A method for producing HCV virus particles comprising isolating HCV virus 
particles from the HCV-infected non-human animal of claim 29. 

37. A method for producing HCV virus particles comprising: 

a) culturing the cell line of claim 25 under conditions that permit HCV 
replication and virus particle formation; and 

b) isolating HCV virus particles from the cell line culture. 

38. A method for producing HCV virus particles comprising: 

a) culturing the cell line of claim 32 under conditions that permit HCV 
replication and virus particle formation; and 

b) isolating HCV virus particles from the cell line culture. 

39. A method for producing HCV particle proteins comprising: 

a) culturing a host expression cell line transfected with the HCV DNA of 
claim 16 under conditions that permit expression of HCV particle proteins; and 

b) isolating HCV particle proteins from the cell culture. 

40. An HCV virus particle comprising a replication-competent HCV genome RNA 
corresponding to the HCV nucleic acid of claim 1, 3, or 5. 

41. An HCV virus particle comprising a replication-defective HCV genome RNA 
corresponding to the HCV nucleic acid of claim 1, 3, or 5. 

42. An in vitro cell-free assay system for HCV comprising HCV genomic template 
RNA of claim 15, functional HCV replicase components, and an isotonic buffered 
medium comprising ribonucleotide triphosphate bases. 

43. An in vitro cell-free assay system for HCV comprising HCV genomic template 
RNA of claim 19, functional HCV replicase components, and an isotonic buffered 
medium comprising ribonucleotide triphosphate bases. 

44. A method for producing antibodies to HCV comprising administering an 
immunogenic amount of HCV virus particles of claim 41 to an animal, and isolating 
anti-HCV antibodies from the animal. 

45. A method for producing antibodies to HCV comprising administering an 
immunogenic amount of HCV virus particles of claim 42 to an animal, and isolating 
anti-HCV antibodies from the animal. 

46. A method for producing antibodies to HCV comprising screening a human 
antibody library for reactivity with HCV virus particles of claim 41 and selecting a 
clone from the library that expresses an antibody reactive with the HCV virus particle. 

47. A method for producing antibodies to HCV comprising screening a human 
antibody library for reactivity with HCV virus particles of claim 42 and selecting a 
clone from the library that expresses an antibody reactive with the HCV virus particle. 

48. An HCV vaccine comprising HCV virus particles of claim 41 in a 
pharmaceutically acceptable adjuvant. 

49. An HCV vaccine comprising HCV virus particles of claim 42 in a 
pharmaceutically acceptable adjuvant. 

50. A method for screening for agents capable of modulating HCV replication 
comprising: 

a) administering a candidate agent to an HCV infected animal of claim 29; 
and 

b) testing for an increase or decrease in a level of HCV infection or activity 
compared to a level of HCV infection or activity in the animal prior to 
administration of the candidate agent; 

wherein a decrease in the level of HCV infection or activity compared to the level of 
HCV infection or activity in the animal prior to administration of the candidate agent is 
indicative of the ability of the agent to inhibit HCV infection or activity. 

51. The method according to claim 47 wherein testing for the level of HCV 
infection is selected from the group consisting of: 

a) measuring viral titer in a tissue sample from the animal; 

b) measuring viral proteins in a tissue sample from the animal; and 

c) measuring liver enzymes. 

52. The method according to claim 50 wherein the HCV genome used to infect the 
animal includes a heterologous gene operatively associated with an expression control 
sequence, wherein the heterologous gene and expression control sequence are oriented 
on the positive-strand nucleic acid molecule, and wherein testing for the level of HCV 
activity comprises measuring the level of a marker protein in a tissue sample from the 
animal. 

53. A method for screening for agents capable of modulating HCV replication 
comprising: 

a) contacting the cell line of claim 32 with a candidate agent; and 

b) testing for an increase or decrease in a level of HCV infection or activity 
compared to a level of HCV infection or activity in a control cell line or in the 
cell line prior to administration of the candidate agent; 

wherein a decrease in the level of HCV infection or activity compared to the level of 
HCV infection or activity in a control cell line or in the cell line prior to administration 
of the candidate agent is indicative of the ability of the agent to inhibit HCV infection 
or activity. 

54. The method according to claim 53 wherein testing for the level of HCV 
infection is selected from the group consisting of: 

a) measuring viral titer in the cells, culture medium, or both; and 

b) measuring viral proteins in the cells, culture medium, or both. 

55. The method according to claim 53 wherein the HCV genome used to infect the 
cell line includes a heterologous gene operatively associated with an expression control 
sequence, wherein the heterologous gene and expression control sequence are oriented 
on the positive-strand nucleic acid molecule, and wherein testing for the level of HCV 
activity comprises measuring the level of a marker protein in a tissue sample from the 
animal. 

56. A method for screening for agents capable of modulating HCV replication 
comprising: 

a) contacting the in vitro system of claim 43 with a candidate agent; and 

b) testing for an increase or decrease in a level of HCV replication 
compared to a level of HCV replication in a control cell system or system prior 
to administration of the candidate agent; 

wherein a decrease in the level of HCV replication compared to the level of HCV 
replication in a control cell line or in the cell line prior to administration of the 
candidate agent is indicative of the ability of the agent to inhibit HCV infection or 
activity. 

57. A method for preparing an HCV nucleic acid comprising joining from 5' to 3' 
on the positive-sense DNA a functional 5' non-translated region (NTR) comprising an 
extreme 5'-terminal conserved sequence, a polyprotein coding region encoding HCV 
proteins that provide for expression of functional HCV proteins, and a 3' non- 
translated region (NTR) comprising an extreme 3'-terminal conserved sequence. 

58. The method according to claim 56 further comprising determining a consensus 
sequence for the 5'-NTR, polyprotein coding sequence, and 3'-NTR from a majority 
sequence of at least three clones of an HCV isolate or genotype. 

59. The method according to claim 56 wherein the 3'-NTR comprises an extreme 
terminal sequence homologous to a DNA having the sequence 

5'-GGTGGCTCCATCTTAGCCCTAGTCACGGCTAGCTGTGAAAGGTCCGTGAG 
CCGCATGACTGCAGAGAGTGCTGATACTGGCCTCTCTGCTGATCATGT-3' 
(SEQ ID NO:4). 

60. The method according to claim 56 wherein the HCV nucleic acid has a positive 
strand sequence as depicted in or corresponding to SEQ ID NO:1 comprising 
substitution of a homologous region from another HCV isolate or genotype. 

61. An in vitro method for detecting antibodies to HCV in a biological sample from 
a subject comprising: 

a) contacting a biological sample from a subject with HCV virus particles 
of claim 41 under conditions that permit binding of HCV-specific antibodies in 
the sample to the HCV virus particles; and 

b) detecting binding of antibodies in the sample to the HCV virus particles, 
wherein detecting binding of antibodies in the sample to the HCV virus particles is 
indicative of the presence of antibodies to HCV in the sample. 

62. An in vitro method for detecting antibodies to HCV in a biological sample from 
a subject comprising: 

a) contacting a biological sample from a subject with HCV virus particles 
of claim 42 under conditions that permit binding of HCV-specific antibodies in 
the sample to the HCV virus particles; and 

b) detecting binding of antibodies in the sample to the HCV virus particles, 
wherein detecting binding of antibodies in the sample to the HCV virus particles is 
indicative of the presence of antibodies to HCV in the sample. 

63. An in vitro method for detecting the presence of HCV in a biological sample 
from a subject comprising: 

a) contacting a cell line permissive for productive HCV infection with a 
biological sample, wherein the cell line has been modified to contain a transgene 
that expresses a reporter gene product expressed under control of a trans-acting 
factor produced by HCV; and 

b) detecting expression of the reporter gene product, 
wherein detection of expression of the reporter gene product is indicative of the 
presence of HCV in the biological sample from the subject. 

64. An in vitro method for detecting the presence of HCV in a biological sample 
from a subject comprising: 

a) contacting a cell line permissive for productive HCV infection with a 
biological sample, wherein the cell line has been modified to contain a defective 
virus transgene, which defective virus transgene will express a reporter gene 
product at high levels under control of a trans-acting factor produced by HCV; 
and 

b) detecting expression of the reporter gene product, 

wherein detection of expression of the reporter gene product is indicative of the 
presence of HCV in the biological sample from the subject. 

65. The method according to claim 64, wherein the defective viral transgene 
produces an engineered alphavirus, the trans-acting helper factor is alphavirus nsP4 
polymerase, and wherein the alphavirus nsP4 polymerase is expressed as a chimeric 
fusion protein with HCV NS4A, such that the alphavirus nsP4 polymerase-HCV NS4A 
chimeric fusion protein is cleaved by HCV NS3 proteinase to release functional 
alphavirus nsP4 polymerase. 

66. The method according to claim 63 or 64 wherein the biological sample is 
selected from the group consisting of blood, serum, plasma, blood cells, lymphocytes, 
and liver tissue biopsy. 

67. A test kit for HCV comprising authentic HCV virus components. 

68. A diagnostic test kit for HCV comprising components derived from an authentic 
HCV virus. 
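
Claims 2 and 58 recite a consensus sequence determined from the majority sequence of at least three clones of an HCV isolate or genotype. The following is a minimal sketch of one way such a per-position majority call could be computed. It assumes the clone sequences are already aligned and of equal length; the function name `majority_consensus`, the placeholder symbol `N` for positions without a strict majority, and the toy sequences are illustrative assumptions and are not taken from the specification.

```python
from collections import Counter


def majority_consensus(clones, ambiguous="N"):
    """Return a per-position majority consensus of pre-aligned sequences.

    Hypothetical helper: assumes equal-length, already aligned clones.
    """
    if len(clones) < 3:
        raise ValueError("at least three clones are required")
    if len({len(c) for c in clones}) != 1:
        raise ValueError("clones must be aligned to the same length")

    consensus = []
    for column in zip(*clones):
        base, count = Counter(column).most_common(1)[0]
        # Keep the base only if it occurs in more than half of the clones.
        consensus.append(base if 2 * count > len(clones) else ambiguous)
    return "".join(consensus)


# Toy, made-up sequences used only to exercise the function.
example_clones = ["GCCAGCCA", "GCCAGCCA", "GCTAGCCA"]
print(majority_consensus(example_clones))
```

With three clones, a position is called only when at least two clones agree; positions where all three differ are marked with the placeholder rather than resolved arbitrarily.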



WO 98/39031 PCT/US98/04428 

1113-1-006 




Figure 2 





Figure 5 



Figure 6 (schematic; labels: forward primer, PCR fragment A, PCR fragment B, reverse primer; reaction mixture before PCR; first cycle of PCR; second and further cycles; result of PCR) 





Figure 9 (AA): sequence rows for clones #248, #227, #213, #211, #209, and #12, with GenBank, PCR-seq, and cons rows where present; nucleotides 564-863, amino acids 75-174. 



Figure 9 (continued): sequence rows for clones #248, #227, #213, #211, #209, and #12, with GenBank and cons rows; nucleotides 864-1163, amino acids 175-274. 



Figure 9 (AC): sequence rows for clones #248, #227, #213, #211, #209, and #12, with GenBank, PCR-seq, and cons rows where present; nucleotides 1164-1463, amino acids 275-374. 



Figure 9 (AD): sequence rows for clones #83, #84, #86, #87, #89, #90, #92, #93, #95, #96, #99, #101, #248, #227, #213, #211, #209, and #12, with GenBank, PCR-seq, and cons rows where present; nucleotides 1464-1583, amino acids 375-414. 



Figure 9 (AE): sequence rows for clones #83, #84, #86, #87, #89, #90, #92, #93, #95, #96, #99, #101, #248, #227, #213, #211, #209, and #12, with GenBank, PCR-seq, and cons rows where present; nucleotides 1584-1703, amino acids 415-454. 



Figure 9 (AF): sequence rows for clones #83, #84, #86, #87, #89, #90, #92, #93, #95, #96, #99, #101, #248, #227, #213, #211, #209, and #12, with GenBank, PCR-seq, and cons rows where present; nucleotides 1704-1823, amino acids 455-494. 



Figure 9 (AG): sequence rows for clones #248, #227, #213, #211, #209, and #12, with GenBank, PCR-seq, and cons rows where present; nucleotides 1824-2063, amino acids 495-574. 



Figure 9 (AH): sequence rows for clones #248, #227, #213, #211, #209, and #12, with GenBank, PCR-seq, and cons rows where present; nucleotides 2064-2303, amino acids 575-654. 



Figure 9 (AI): sequence rows for clones #248, #227, #213, #211, #209, and #12, with GenBank, PCR-seq, and cons rows where present; nucleotides 2304-2543, amino acids 655-734. 



Figure 9 (continued): sequence rows for clones #248, #227, #213, #211, #209, and #12, with GenBank, PCR-seq, and cons rows where present; nucleotides 2544-2783, amino acids 735-814. 



Figure 9 (AK): sequence rows for clones #248, #227, #213, #211, #209, and #12, with GenBank, PCR-seq, and cons rows where present; nucleotides 2784-3023, amino acids 815-894. 






AL 



#248 

#227 

#213 

•211 

#209 

#12 

GenBank

PCR-seq 

^^024 ATOITCGGAC^ ^083 

895 IPGPLWILQASLLKVPYPVR 914 

#248 ° 1 

#22T ' 

•213 

#211 r 

•209 • 

#12 ° 

GenBank • 

PCR-seq



cons 

084 GTTCAAGGCCT ^ „ ^ „ 

915VQ G LLRICALARKIAGGHYV 934 



3084 GWAAGGCCTT^ ^^^^ 



#248 a 

#227 * 

•213 a 

•211 

♦209 * 

GenBank « ° 

PCR-seq * • * 

*^**^i44 CAAATCGCCATCATC^ ^^03 

935 QMAIIKLGALTGTYVYNHI.T 954 



•248 

•227 

•213 ^ 

•211 ® 

•209 : •••• 

•12 ^ 

GenBank

PCR-seq y ' 

955 rLRDWAHNOLRDLAVAVBPV 974 



Figure 9 






AM 



#248
#227
#213
#211
#209
#12

GenBank
PCR-seq
cons



3264 GTCTTCTCCCCyuvTGGAGACCA^ 
975 VFSRMETKLITWGADT A TTT VsV 



5T 3323 



#248 « 

#227 ^ 

#213 c 

#211 "[ : 

#209 

#12 !a! i i ^* " 

Genfiank I [ " 

pcR-seq * * 

cons ^* * * 

Hit ?^^^«=S*C?5CT^ 3383 
995 D1IHGLPVSARRGQEILLG P 1014 

#248 g 

#227 

#213 

#211 

#209 * 

#12 i i i 

GenBank 

PCR»seg ... 

cons 

?«fc f^2!^^^2^^^^^^^^ 3443 
1015 ADGMVSK6WRLLAP1T. AYAQ 1034 

ff248 Q 

#227 

#213 

#211 

#209 • 

#12 i 

GenBank 

cons i!!!!!!!!!!!'!! 

3444 CAGACGAGAGGCCTCCTAGGGTGTATAATCACC^^ 35O3 
1035 QTRGLLGCIITSLTORDKMQ 1054 



Figure 9 






AN 



1248 

#227 a 

#213 

#211 

#209 

#12 

GenBank g 

cons 

3504 GTGGAGGGTCAGGTCCAGATCGTGTCAACnw:TACCCAAACCTTCC^ 3563 
loss VEGBVQIVSTATQTPLATCl 1074 

#248 

#227 " 

#213 g [ 

#211 .V. 

#209 

#12 

GenBank 

PCR-seq 

cons 

3564 AATGGGGTATGCIGGACTCTCTACCACGGGGCCGGJUVCGAGGACCA 3623 
1075 NGVCWTVYHGAGTRTIASPK 1094 

#246 

#227 c... 

#213 

#211 

#209 

#12 

GenBank ..C. t c 

PCR-seq 

cons 

3624 GGTCCTGTCATCCAGATGTATACCAATGTGGACCAAGACCTTGTGG 3683 
1095 G PVZQMYTNVDQOLVGWPAP 1U4 

#248 c c... 

#227 c.., 

#213 c... 

#211 c... 

#209 

#12 C... 

GenB&nk c... 

PCR-seq c. - • 

cons. ; c. . . 

3684 CAAGGTTCCa3CTCATTGACACCClXX:ACCTGCGGCa^^ 3743 
1115 QGSRSLTPCTCOSSOLYLVT 1134 

#248 t 

#227 t 

#213 t 

#211 G 

#209 

#12 t 

GenBank t 

PCR-seq t 

cons t , 

3744 AGGCACGCCGAOjTCATTCCCGTGCGCCGGCGAGGTGATAGCAGGGGTAGCCr^^ 3803 
1135 RHADVIPVRRRGDSRGSLLS 1154 



Figure 9 






AO 



»248 t.g 

#227 t.g 

•213 t.g 

#211 t.g yy 

#209 t.g 

#12 . t.g [[[ 

GenBank t.g A 

PCR-seq t.g [ , [ 

cons t.g , ] 

3804 CCCCQGC0C^TTTOCTACCTAAAAGGCTCCTCGGGGGaT0CG C T g TTG^ 3863 
1155 PRPZSYLXOSSGGPLLCPAG 1174 

#248 G.t 

#227 G.t..r.- 

#213 , G.t 

#211 Q 

#209 a G 

#12 Gt '/.'.[[ 

GenBank , c.t 

PCR-seq G.t... 

cons O.t 

1175 HAVGLFRAAVCTROVTKAVD^ 1194 

#248 C. G 

•227 G 

•213 G 

•211 

•209 G 

• 12 G 

GenBank •••••• 

cons 



3924 irrATCCCTGTGGAGAACCTAGAGACAACOlTCAGATCCCCGGTGTTCACGGAC^ 3983 
1195 FZPVENLETTMRSPVPTDNS 1214 



•248 c 

•227 c 

•213 c 

•211 c 

•209 ...,C 

•12 C 

GenBank , c 

cons . ; c 

3984 TCTCCACCAGCAGTGCCXCAfiAGCTTCCAGGTCGCCCaCC^^ 4043* 
1215 SPPAVPQSFQVAKLHAPTGS 1234 



•246 

•227 

•213 A 

#211 

#209 

#12 

GenBank A • 

cons 

1235 GKSTRVPAA ^^^^^^^^^J^'^*®^^^ 



Figure 9 






#248 t 

#227 t 

#213 t 

#211 A 

#209 

#12 t 

GenBank a t 

cons t 



4104 AACCCCTCTGTTOCTGCJUlC GC TGGC gT TT Gg AX K r i TACAT^ 4163 
1255 NPSVAATL6FGAYMSRAKGV 1274 



#246 t 

#227 

#213 C 

#211 

#209 

#12 g 

GenBank 

cons • 

4164 GATCCTAATATCA06ACCGGGGTGAGAACAATTACCACTQGCAC0CC»^ 4223 
1275 DPNZRTGVRTZTT6SPZTYS 1294 



#248 t. 

#227 t 

#213 t 

#211 t 

•209 

•12 

GenBank C t 

cons t 

4224 ACCTACGGCAAOinx:CVlX;m A 0CGCG GC TGCTCAOGAGC0GC^ 4283 



1295 TYGXPLAPOGCSGGAYDZZZ 1314 



#248 

#227 

#213 

•211 

•209 

♦12 

GenBank ....C*. 

cons 

42B4 TGTGACGMTCCCACTCCACGGATGCCACATCCATCT^^ 4343 
1315 COBCBSTDATSZLGXGTVLD 1334 



•248 C 

•227 c 

•213 C 

•211 

•209 

•12 

GenBank c 

eons c 



4344 CAAGCAGAGACTGCGOGGGCGAGATTGGTTGTGCTOGCCACTGCTACCCCTC^ 4403 
1335 QABTAOARLVVLATATPPOS 1354 



Figure 9 






•248 

§227 \VV\\, ^ 

1213 • •••^ 

#211 

•209 

•12 v.v/.v/.'.\\\\\\\\\\ 

GenBank !!!!!!!!!!!!!!!!!!!!! A 

cons !!!! ' 

4404 GTCJlCTGTOrCCCATCCTAACATCGAGGAGCTTCOT 4463 
1355 VTVSHPNIEEVALSTTQEIP 1374 



•248 ..t 

•227 ..t 

•213 ..t *^ 

•211 ..t \\ ^ 

•209 ..t 

•12 . .t G!!!!!!!!]!!iiiiii|[[[ 

GenBank ..t [ " 

pcR-seq !!!!!!!!!!!!!!!! 

cons . ..t c 

4464 TTCTACGGCAAGGCTATCCCCCTCGAGGTGATCAAGGGG^^ 4523 
1375 FYGKAIPLEVXKGGRHLIFC 1394 

•248 

•227 \\\ 

•213 

•211 : 

•209 t 

•12 iiiii! 

GenBank 

pcR-eeq 

cons , 

4S24 CACTC»AAGAAGAAGTGOGACGAGCTCGCCGaylAGC^ 4583 
1395 HSKKK^CDELAAKLVALGINA 1414 
♦248 t G 

• 227 t ! G a" 

•213 t..c G 

•211 t..c G **" 

• 209 Q 

•12 i i i *G !!!!!!!!!!!!!!! ! 

GenBank t !!.!!!g !!!!!!! 

pcR-seq t !!!!!! !g! * 

cone t !!;!!!!g! 

4584 GTOGCCTACTACXSSCGGACTTCACGTGTCKa^ 4543 
1415 VAyyRGI.DVSVXPTNGDVVV 1434 

•248 

•227 

•213 t \\ 

#211 [ 

#209 \\ 

#12 

GenBank !!!!!!!!!!!!!!! 

pcR-aeq 

cons 

4644 GTGTCGACCGATGCTCTCATGACTGGCTTTACOGGCXSAC^^ 4703 
1435 VSTDALMTGPTGOPDSVXDC 1454 



Figure 9 






AR 



t248 

#227 

#213 

#211 

#209 

#12 ° 

GenBank ^ 

PCR-6eq 

cons . ..•«..■...•...••••••••••••••■•••••"•••••••• * **** * * * * * ' * * • • • • 

4704 AACAaSTCTCT CA CTCAGACAGTCGRTTTCAGOCTIGACCCTACCTTC 4763 
14S5 MTCVTQTVDPSLDPTPTIBT 1474 

#248 * • 

•227 * Q 

#213 * • 

#211 * , 

#209 

il2 

GenBank * • 

PCR-aeq *• 

^*"*4764 ACrA C KTCCC C CAG^ ^^^^ 
1475 TTLPQDAVSRTQRRORTORG 1494 

#24B t.A 

#227 * 

#213 

#211 

#209 

#12 

GenBank t 

4824 AAGOCAGOOITCT^^ 
1495 KPGIYR FVAP6ERPSOMPDS 1514 

#248 ^ 

#227 t C 

#213 ^ 

#211 ^ 

#209 ® Z, 

#12 ^ 

GenBank ^ 

^fifid TCCGTCCTCTOTGACTOCTAT^ 4943, 
1515 SVLCECYDAOCAWYBLMPAE 1534 

•248 • 

#227 • 

#213 

#211 

#209 

§12 

GenBank 

4944 ACTACACTTAGGCT 5003 
1535 TTVRLRAYMHTPGLPVCQOH 1554 



Figure 9 






•248 

•227 - = 

#213 

#211 "'g ^ 

1209 * ^ 

• 1 2 

GenBank G [ ' 

cons. ^ 

5004 S0S3 

1248 

#227 [][ 

#213 c ^- 

#211 c t 

#209 

#12 

GenBank - ! 

]] 

S0S4 CAOJC^UU.^^ ^^^^3 

#248 

#227 ^ , [ 

#213 

1211 " ; 

#209 

#12 

GenBazA !!!!!!!!!!!!! c 

cons 

5124 GCTAGGGCTCAAfi^^ 

1595 '^RAQAPPPSWDQMWKCLIRL 1614 

#248 

#227 []] 

#213 

#211 

#209 

#12 ' - i i ! 

GenBank 

cons 

lilt ^"""^"^^^^^ 5243 
•248 

#227 [ 

#213 

#211 

#209 

#12 iiiii.*!!!!!!!!!!!!!.'!! 

GenBank 

cons 



Figure 9 






1248 

•227 

«213 

#211 

#209 

#12 

GenBank 

cons 

5304 GTCGTCACGAGCA 
1655 V V T S T 



#246 c 

#227 ^ 

#213 

#211 

#209 

#12 

GenBank c

cons • • 

5364 CTGTCaACAG G CTGOGTG G TCATAGTOOGCAGCS A TTGT ^ 5423 

1675 LSTGCVVIVGRZVLSGKPAZ 1694 

#248 

#227 

#213 

#211 

#209 

#12 

GenBank • 

cons •••• 

5424 ATACXTGACAGGCft G GTTCTC TA CCAGGAGTTaa^TGAGAT^^ 5483 

1695 ZPDREVLYQEFDEMEECSQH 1714 



#248 

#227 

#213 

#211 

#209 

#12 

GenBank , 

cons 

5464 TTACCGTACATCGAGCAAGGGATGATGCTCGCTGAGCAGTTaJVGCAGAA 5543 
1715 LPYXEQGMMLAEQFRQXALG 1734 



#248 

#227 A 

#213 A c 

#211 A c 

#209 

#12 C G... 

GenBank 

cons A '11!^* * 

1735 LLQTASRRABVZTPAVQTNW 



5603 
1754 



Figure 9 






AU 






#248 t 

#227 c.. ^ 

•213 a t ^ 

•211 t ^ 

#209 ^--C 

#12 c 

••• • ^ 

^ c 



SS04 ^^-^-5-™^^ 



#248
#227
#213
#211
#209
#12

GenBank
cons



5664 WCGCGG^ 

1775 LAGLSTL.PGMPAiAsi.MAPT 1794 



#248
#227
#213
#211
#209
#12

GenBank



V^ltf^^^f^^^^^^^ 5783 
1795 AAVTSPLTTGQTLLPHiLGG 1814 

♦248 t 

#227 

#213 

#211 ; _ 

•209 : 

•12 !!!!!!!!!!!!a! 

GenBank 1 • i ] 

eons . ^i!!!!]!! 

5784 TGG GT CXCTG C CCAGCTCGCCSGCCCCCGGTOCOG^ coai 
1815 WVAAQLXAPGAATAFVO aVl 183* 

•248 

•227 

•213 ; . ; ; ; 

•211 

•209 ; ^ 

•12 

GenBank aC* ..A 

c 

5844 GCTGGCGCaxrCATCGCCAGCGTTGGACrc^^ 59O3 
1835 AGAAI GS VGLGKVLVDl.LAC 1854 



Figure 9 






AV 



#248 

•227 g 

•213 9 

•211 g C 

•209 

•12 

GenBank g 

cons 

5904 TATGGCGCCX5GCGTGGCGGGAGCTCTTGTAGCATO:AAGATCATGA^^ 5963 
1855 YOAQVAOALVAPKZMSGEVP 1874 

•246 : 

•227 

•213 

•211 

•209 

•12 

GenBank a C...

cons • 

5964 TCCACGGAGGACCTGGTCAATCTGCTGCCCGCCATCCTCTC^ 6023 
1875 STBDLVNLLPA.ZLSPQALVV 1894 

•248 

•227 - 

•213 

•211 i 

•209 

•12 c 

GenBank Tt...T. Gt 

cons 

6024 GGTCTEX^GTCTGCGCAGCAATACTGCGCCGGCACGTTGOCCC^^ 6083 
1895 GVVCAAZLRRHVGPG BGAVQ 1914 

•248 

•227 

•213 ...t 

•211 

•209 

•12 

GenBank a...... 

PCR-seq 

cons 

6084 TGQATBAACOGGCTAATAGCCTTCGC CT CCCGGGGGAACailVmxJCX^^ 6143 
1915 WMHRLIAPASROH HVSPTHY 1934 

•248 

•227 

•213 

•211 

•209 

•12 

GenBank • • 

PCR-seq 

cons ■ 

6144 GIGCCGGAOAGCGATGCAOOCGCCOGCGTCACTGOCATACTC^^ 6203 
1935 VPESDAAARVTAZLSSLTVT 1954 



Figure 9 






AW 



#248 

#227 

#213 '"a t 

#211 "I • t 

#209 t 

#12 g.'."!!!.'!!!!!!!; ^ ^ 

GenBank g[ ^ 

pcR-seq g. ' ? 

cons . 



.t 

1355 QI'I'RRLHQWISSECTTPCSG 



6204 CAGC^CTGA^ 6263 



1974 



#248 . 

#227 

#213 

#211 

#209 • 

#12 !!!!!!!!!!! 

GenBank ! 1 ! 

pcR-seq 

cons • • ••••• 

«64 rr^rrfTTTrT^ im 

#248 

#227 

#213 

#211 

#209 

#12 

GenBeuik * 

pcR-seq i ' i i ! ^ ' 

cons * 

1995 '^'^'^M'^^^f^ 6383 

is»b I'KAKLMPQLPGIPPVSCQ. RG 2014 

#248 

#227 

#213 

#211 

#209 

#12 

GenBank i i i ! i i i !!!!!!!!!!! i * ' 

pcR-seq !!!!!!!!!! 

cons ' ' • 



Figure 9 






AX 



#248
#227
#213
#211
#209
#12

GenBank
PCR-seq



2035 ITGHVKMGTMRi yGPRTCRN 2054 



#248 

#227 

#213 

#211 

#209 

#12 

GenBank 



^**"6504 ATOTOGAGTQGGicOT ^563 
2055 MWSGTPPIHAYTTOPCTPLP 2074 



#248 

#227 

#213 

#211 

#209 c 

#12 « 

. GenBank • • * 

*^*^564 GOTCCGAACTATAAOT^^ ^^^^ 
2075 A PKYKFALWRVSABBYVBIR 2094 



#248 

#227 

#213 

#211 

#209 

#12 

GenBank ^ ^ 

°°^624 CGGGTOOOGGAci^^ f??^ 
209S RVGDFHYVSGMTT DNLKCPC 2114 



#248 

#227 

#213 ^ 

#211 

§209 

#12 

GenBank 

^^6684 CAGATCCCATCGCOX^ ^'^^^ 
2115 QIPSPBPFTELOGVRLHRPA 2134 



Figure 9 






AY 



»248 

#227 

»213 

• 211 

#209 

il2 

GexiBanlc • ^ 

cons • 

6744 CCCCCTTOCAAGCC C TTCXrK X SGGGAGGMXSTATC^ 6803 
2135 PPCK PLLREBVSFRVGLKSY 2154 



1227 

#213 e — 

#211 

• 209 

#12 

GenBank

cons 

6804 CCGGTGGGGTCOCAATTACCTTGCGAGCCCGAACCGGACGTAGCCGn^^ 6863 
2155 PVaSOLPCBPEPDVAVLTSH 2174 



#248 

#227 

#213 

#211 

#209 g.,a... 

#12 g..a... 

GenBank • • • 

ns 

6864 CTCACTGATCCCTCCCATATAACJUXrAGAGGCGGCCGGGAGAAOG 6923 
2175 LTDPSHITABAAORKLARQS 2194 



#248 

#227 

#213 A.t 

#211 t 

#209 t 

tl2 t 

GenBank - 

• 

6924 CCCGCTTCTATGGCCAGCTCCTCGGCCAGCCAGCTGTCC^^ €983 
2195 PPSKASSSASQLS.APSLKAT 2214 



#248 
#227 
#213 
#211 
#209 
#12 

GenBank 



6984 TGCACCXXXAACCATGACTCCCCniGACGCCGAGCTCATA^^ 7043 
2215 CTANHDSPDAELIBANLLWR 2234 



Figure 9 






AZ 



»248 a 

#227 

•213 - 

#211 

*209 

#12 

GenBank • 

cons. . . , .* '_:.LILI 

7044 CAGGACaTGGCSCGGCAAOlTaiCCAGGGTTGAGTCAGAGAACAAAGTGGT^ 7103 

223S QEMGGNZTRV BSENKVVILD 2254 



#248 ^ 

#227 

#213 tv- 

♦211 

#209 

#12 ' 

GenBank

cons 

7104 TCCTTCGATCCGCTTGTGGCAGAGGAOGATGAGCGGGAGG^^ 7163 
2255 SFDPLVABEOERBVSVPABZ 2274 



•248 

#227 

•213 e 

#211 • 

#209 

#12 c 

GenBank Ca e 

cons 

7164 CTGCGGAAGTCTCGGAGAT 

2275 LRKSRRPARALPVWARPDY 



7223 
2294 



•248 T 

•227 

#213 A 

• 211 A 

#209 gA 

#12 g 

GenBank . . . .T 

cons V* " 

7224 CCCCCGCTAGTAGAGACGTGGAAAAAGCCTGACTACXSAACCACCTGTGGTCCATOOC^ 7283 
2295 PPLVBTWKKPDYBP PVVHGC 2314 



•248 

•227 

•213 A 

#211 * 

#209 g 

•12 g 

GenBank 

cons 

7284 CCyrTACiCACCTCCAa aQ TCCXCTCC T Gt^^ 7343 
2315 PLPPPRSPPVPPPRKKRTVV 2334 



Figure 9 






BA 



i24B T 

•227 T 

•213 T 

•211 T 

•209 T c. 

•12 

GenBank 

cons T 

7344 CTCACCGAATCAACCCTACCTA 

2335 LTESTLPTALAELATKSFGS 



7403 
2354 



•24B C, 

•227 C 

•213 C, 

•211 C, 

♦209 C 

•12 C 

GenBank 

cons C 



7404 TCCrrCAACTTCC<»ATTACGGGCGACAATATGACAACATCCTCTGAGC 7463 
2355 SSTSGZTGDNMTTSSEPAPS 2374 



•248 

•227 

•213 

•211 

•209 

♦12 

GenBank 

cons 

7464 GGCTGCCCCCCCGACTCOyiCGTTGAGTCCTATTCTTC^ 7523 
2375 GCPPDSDVESYSSMPPLEGE 2394 



•248 

• 227 . 

• 213 C 

• 211 

•209 C 

•12 C 

GenBank G • 

PCR-seq 

cons c 1111 'rvii 

7524 CCTGGGGATCCGGATTTCAGCGACOCXSTCATOGTCQACX3(3TCAGTAOTGOGGCC^ 7583 
2395 PGDPDPSDGSW8T VSSGADT 2414 



•248 T 

•227 : 

•213 T 

•211 T 

•209 T 

•12 ..fl T 

GenBank T 

PCR-seq T 

cons 7 

7584 GJUUaATGTCGTGTGCTGCTCAATGTCrm 7643 



2415 EDVVCCSMSYTKTGALVTPC. 2434 



Figure 9 






BB 



•246 

•227 

•213 

•211 

•209 

•12 t t 

GenBank g 

FCR-seQ 

cons -. 

7644 GCTOCGGMUSAACAMIAACTGCCCJITCAACGCACTGAGCAACTCGTTOCT^ 7703 
2435 AABBQKLPI NALSNSLLRHH 2454 



•248 

•227 

•213 g 

•211 9 

•209 g a 

•12 a.,g 

GenBank g A 

PCR*seq g 

cons .g.; • 



7704 AATCTGGTATATTCCACCACTTCAOGCAGTGCTTGCCAAAGGCAGAAGAAAGT^ 7763 
2455 NZ.VYSTTSRSACQRQKXVTP 2474 



•248 

•227 

•213 

•211 

•209 

•12 

GenBank 

PCR-seq 

cpns 

7764 GACAGACTGCAAGTTCT GG ACAGCCATTACCAGGACGTGCTC A AGGAGGTCAAAGCA^^ 7823 
2475 DRLQVLDSKYQDVLKEVKAA 2494 



•248 

•227 i 

•213 

•211 : 

•209 

•12 C 

GenBank ....G 

PCR-seq 

eons . «•.* • 

7824 GCGTCAAAAGTGAAGGCTAACTTCCTATCCGTAGAGGAAGCTTGCAGCC^ 7883 
2495 ASXVXANLLSVBBACSLTPP 2514 



Figure 9






BC 



#248 t 

#227 

#213 

#211 t 

#209 

#12 

GenBank 

PCR-seq • . . • 

cons \ 

7884 CATTCAGCCAAATCCAAGTTTCXSCn^ATGGGGCAAAAGACGTCCGT^ 7943 
2515 KSAKSXFGYGAKDVRCHARK 2534 



#248
#227
#213
#211
#209
#12

GenBank
PCR-seq



7944 GCCGTAGCCCACATCAACTCC G TGTCX^AAAGACCTTCTGGAAGACAGTGTA 8003 
2535 AVARINSVWKDLIiEDSVTPI 2554 



#248 C t a 

#227 C 

#213 C t a 

#211 C t a 

#209 C t 

#12 C t 

GenBank c • t • 

PCR-seq C t 

cons C t • 

8004 GACACTATCATCATGGCCAAGAACGAGGTCTTCTGCGrrCAGCCT^ 8063 
2555 DTI I MAKNB V FCVQPEKGGR 2574 

#248 C 

•227 

#213 

•211 

•209 

•12 

GenBfluik ' 

PCR-seq 

cons 

8064 AAGCaiGCTCGTCTCATCGTGTTCCCCGACCTQGGCGTGCGOGT^^ 8123 
2575 KPARLIVPFDLOVRVCEKMA 2594 

•248 g 

#227 

#213 g 

#211 g 

#209 g 

#12 g 

GenBank g t 

g 

8124 CTGTAOGACGTGGTTAGCAAACr rC C g CCTG GCO G T C ^ 8183 
2595 LYDVVSKLPLAVHGSSYGFQ 2614 



Figure 9 






•248 

#227 

#213 

#211 

#209 a 

#12 a 

GenBank 

cons 

8184 TACTCJtf:CAG6ACAGCG GG TTQAATTCCT0 G TGCAA0C^^ 8243 
261S YSPGQRVEFLVQAWKSRKTP 2634 



#248 T 

#227 T rr 

#213 T 

#211 T 

#209 T 

#12 T 

GenBank T 

eons T 



2635 MOFPYDTRCFDSTVTBSDXR 2654 



#248 

#227 

#213 

#211 

•209 

#12 C 

GenBank

cons 

8304 ACGQAGGAGGCAATTTACCAATGTTGTGACCTGGACCOCCAAGCCC G OOTO^ 8363 
2655 TSEAZ YQCCDLDPQARVAZX 2674 



#248 

#227 

#213 

#211 

#209 

#12 

GenBank *. C 

cons *. 

8364 TCCCTCACaGftGAGGCTlTAavr W S GG aGCXCTC ^ 8423 
2675 SLTSRLYVGGPLTNSROEHC 2694 





















































8424 OGCTATCGCAOGTG 
2695 O Y R R C 




AACTAGCTGTGGTAACAOCCTCACT 
TSCGHTLT 


R A 8 0 V L T 



Figure 9 






«248 T-C 

#227 T < 

#213 T 

•211 T 

t209 T A. 

#12 T 

GenBank C T 

cons T 



8484 TGCTACATCAAGGCCXXXSGCAGCCCGTCGAGCCGCAGGGCTCCAGGACl^^ 8543 
2715 CYXKARAARRAAGLQDCTML 2734 



#248 a.% 

•227 A 

#213 

•211 

•209 t 

•12 c 

GenBank - - • 

eons . • • 

8544 GT(nCTGGCGACGAC?rTAC?rCGTTATCTGTGJU^ 8603 
2735 VCGDDbVVZCBSAGVQEDAA 2754 



•248 c 

•227 c 

•213 c 

•211 c 

•209 C... 

•12 c 

GenBank c 

eons* ..•.•••.••••••c. •«••..• • - • * V" • ••••••• 

8604 AGCCTGAGAGCCTTTACGGAGGCTATGACCAGGTACTCCQCCCCCCCCGGGGACCCCCCA 8663 
2755 SLRAPTEAMTRYSAPPCDPP 2774 



•248 

•227 C 

•213 

•211 

•209 

•12 c t... 

GenBank 

cons • •.•••«•••••*•• 

8664 CAACCAGAATACGACTTGOMSCTTATAACATCATGCrC^^ 8723 
2775 QPEYDLBLITSCSSIIVSVAH 2794 



•248 g 

•227 

•213 g 

•211 g c 

•209 g t 

•12 gc 

GenBank g • • 

*''"*8724 GAOGGCGCTOGAAAAIUXSGTCT^ 8783 
2795 DOAGKRVYYLTROPTTPLAR 2814 



Figure 9 






•24 B 

•227 

•213 

•211 

•209 

•X2 

GenBank 

cons . , 

8784 GCCGCGTGGGAGACAGCAAGACACACTCCAGTCAATTCCTGGCT 8843 
28X5 AAWBTXRHTPVMSWL6NZZM 2834 



#248
#227
#213
#211
#209
#12

GenBank



8844 TTTGCCCCOICACTGTGGGCGAGQATGATACTGATGACC^ 8903 
2835 PAPTLWARMILMTHPP5VLI 2854 

•248 G 

•227 i G 

•213 • G 

•211 G 

•209 C 

•12 c Gg 

GenBank c c G 

cons C 

8904 GCCAGGGATCAGCTTGAACAGGCTCTTAACTGTGAGATCTAOGCAGCCTGCl^ 8963 
2855 ARDQLEQALNCEIYAACYSZ 2874 

•248 G C 

•227 C... 

•213 C 

•211 C 

•209 .-..C 

•12 C 

GenBonk ••«.. C... 

cons • C. ... 

8964 GAACCACTGGATCTACCTCCAATCATTCAAAGACTCC AT G CC CTC^ 9023 
2875 BPLDLPPZZQRLHGLSAFLL 2894 

•248 A 

•227 A 

•213 A 

•211 A 

•209 A 

•12 A 

GenBonk A.t 

PCR-aeq • 

A 

9024 CAatfOTACTCTCCAGGTGMtfiTCAATASG G TOC O O C CATgOCT^^ 9083 
2895 B8YS PGBVNRVAACLRKLOV 2914 



Figure 9 






BG 






#248 " 

#227 

#213 ; ... 

•211 a 

#209 

«i2 9 ' — ^ — !!!!!!!!!! II ! .G. -a 

GenBank • ! ! ! ! ! ! II ! . ! . . -a 

FCR-seq • ' 

#248 illllll 

§227 

#213 *. 

#211 

#209 ■ 

#12 I!!IIIII!!IIII... 

GenBank A 

'^si» hi^Jii^i^^^i^^^ nil 

2935 GGRAAICGKYLFMll*VKir».u 

5 2 2? ^ : 

1213 

till ■.*.!■.'.'.'*■.'.!*.!'.". * 

5"' !'.*.!'.!'/. !*//.".-*.•••••••••• 

GenBank g...K 

PCR-seq 

29SS iclTPIAAX6Rl.BI.SOwrT*w 



Figure 9 






[Figure 10: schematic of the HCV genome (C, E1, E2, NS2, NS3, NS4a,b, NS5a, NS5b) with KpnI and NotI sites labelled, rows labelled 12, 209, 211, 213, 227 and 248, and the vector. The drawing is not reproduced in this text.]



Figure 10 






[Figure 11: schematic of the HCV genome showing the ORF (C, E1, E2, NS2, NS3, NS4a,b, NS5a, NS5b). The drawing is not reproduced in this text.]



Figure 11 



INTERNATIONAL SEARCH REPORT

International application No.
PCT/US98/04428

A. CLASSIFICATION OF SUBJECT MATTER
IPC(6) : A61K 39/29; C12N 1/15, 1/21, 5/10, 5/14, 5/16; C12Q 1/70
US CL : 424/93.6, 228.1; 435/5, 252.3, 254.2, 320.1, 325, 348, 419

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)
U.S. : 93.6, 199.1, 228.1; 435/5, 6, 91.33, 172.3, 235.1, 236, 252.3, 252.33, 254.2, 254.21, 320.1, 325, 348, 363, 370, 419

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practicable, search terms used)
Please See Extra Sheet.

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category*: T
Citation of document, with indication, where appropriate, of the relevant passages: CLARKE et al. Developments in Hepatitis C during 1996-1997. Expert Opinions in Therapeutic Patents. September 1997, Vol. 7, No. 9, pages 979-987, especially pages 983, column 2, to 984.
Relevant to claim No.: 1-21, 25, 26, 30-33, 40-43, 48, 49, 57, 58, 67, and 68.

Category*: X
Citation of document, with indication, where appropriate, of the relevant passages: YOO et al. Transfection of a Differentiated Human Hepatoma Cell Line (Huh7) with In Vitro-Transcribed Hepatitis C Virus (HCV) RNA and Establishment of a Long-Term Culture Persistently Infected with HCV. Journal of Virology. January 1995, Vol. 69, No. 1, pages 32-38, see entire document.
Relevant to claim No.: 1, 2, 4, 5, 9, 14-16, 20, 25, 26, 32, 33, 40-42, 57, and 58.

Category*: X
Citation of document, with indication, where appropriate, of the relevant passages: US 5,106,726 A (C. Y. WANG) 21 April 1992, column 36, lines 35-68.
Relevant to claim No.: 67 and 68


□ Further documents are listed in the continuation of Box C.

□ See patent family annex.




* Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance

"E" earlier document published on or after the international filing date

"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or other means

"P" document published prior to the international filing date but later than the priority date claimed

"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention

"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art

"&" document member of the same patent family


Date of the actual completion of the international search
29 MAY 1998

Date of mailing of the international search report
H 6 JUL 1998

Name and mailing address of the ISA/US
Commissioner of Patents and Trademarks
Box PCT
Washington, D.C. 20231
Facsimile No. (703) 305-3230

Authorized officer
THOMAS O. LARSON
Telephone No. (703) 308-0196



Form PCT/ISA/210 (second sheet)(July 1992)*



Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)

This international search report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

1. □ Claims Nos.:
because they relate to subject matter not required to be searched by this Authority, namely:

2. ☒ Claims Nos.: 3, 6, 7, 8, 17, 18, and 59
because they relate to parts of the international application that do not comply with the prescribed requirements to such an extent that no meaningful international search can be carried out, specifically:

The claims refer to specific SEQ ID NOS., but a Sequence Listing in computer readable format has not been [illegible]. Claims 10-16, 19-21, 32, 33, 40-43, 48, and 49 have only been searched to the extent possible without a sequence search.

3. □ Claims Nos.:
because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:
Please See Extra Sheet.

1. □ As all required additional search fees were timely paid by the applicant, this international search report covers all searchable claims.

2. □ As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment of any additional fee.

3. □ As only some of the required additional search fees were timely paid by the applicant, this international search report covers only those claims for which fees were paid, specifically claims Nos.:

4. ☒ No required additional search fees were timely paid by the applicant. Consequently, this international search report is restricted to the invention first mentioned in the claims; it is covered by claims Nos.:
1-21, 25, 26, 30-33, 40-43, 48, 49, 57, 58, 67 and 68.

Remark on Protest
□ The additional search fees were accompanied by the applicant's protest.
□ No protest accompanied the payment of additional search fees.

Form PCT/ISA/210 (continuation of first sheet(1))(July 1992)*








B. FIELDS SEARCHED

Electronic data bases consulted (Name of data base and where practicable terms used):
APS, STN (Biosis, CAplus, INPADOC, LifeSci, Medline, WPIDS)

Search Terms: HCV, Hepatitis C Virus, infectious clone, functional clone, infectious transcript, recombinant, viral vector, vector, vaccine, kit, cell line, permissive, replication, infection.

BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING
This ISA found multiple inventions as follows:

This application contains the following inventions or groups of inventions which are not so linked as to form a single inventive concept under PCT Rule 13.1.

Group I, claim(s) 1-21, 25, 26, 30-33, 40-43, 48, 49, 57-59, 67, and 68, drawn to genetically engineered HCV clones, vectors, cells, vaccines, and kits comprising said clones, method of using clones to identify cells permissive for HCV replication and method for making HCV clones.

Group II, claims 22, 27-29 and 34-35, drawn to method of infecting or identifying animals permissive for HCV infection.
Group III, claim(s) 23 and 24, drawn to method of selecting HCV having adaptive mutations.
Group IV, claim 36, drawn to method of making viral particles in an animal.
Group V, claims 37-39, drawn to method of making viral particles in cultured cell lines.
Group VI, claims 44-47, drawn to method of making antibodies.
Group VII, claims 50-52, drawn to method of screening for agent that modulates HCV replication in animals.
Group VIII, claims 53-56 and 60, drawn to method of screening for agent that modulates HCV infection in cell lines.
Group IX, claims 61 and 62, drawn to methods of detecting HCV antibodies by binding to viral particles.
Group X, claims 63-66, drawn to methods of detecting HCV infection in a sample using engineered cells.



The inventions listed as Groups I to X do not relate to a single inventive concept under PCT Rule 13.1 because, under
PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: The method of
determining unity of invention under PCT Rule 13 permits, in addition to an independent claim to a product, an
independent claim to a process of making and an independent claim to a process of using said product. In the instant
case, unity of invention exists between the compositions comprising engineered HCV nucleic acids, the method of
making an engineered HCV nucleic acid (claims 57-59) and the method of using engineered HCV nucleic acids to
identify and/or infect permissive cell lines. Independent claims to additional methods of using engineered HCV nucleic
acids of groups II-X are not considered to be linked by a "special technical feature". Therefore, unity of invention does
not exist between the method of using engineered HCV nucleic acids to identify and/or infect permissive cell lines of
group I, the method of using engineered HCV nucleic acids to identify and/or infect animals permissive for HCV
infection of group II, the method of selecting adaptive mutations of group III, the method of making viral particles in an
animal of group IV, the method of making viral particles in a cell line of group V, the method of making antibodies of
group VI, the method of screening for agents that modulate HCV replication in animals of group VII, the method of
screening for agents that modulate HCV replication in a cell line of group VIII, the method of detecting antibodies of
group IX, and the method of detecting HCV infection using engineered cells of group X. Therefore, the claims of groups
I-X are not so linked by a special technical feature within the meaning of PCT Rule 13.2 to form a single inventive
concept.



Form PCT/ISA/210 (extra sheet)(July 1992)*



